Today we will talk about multicast, a network delivery mechanism that is used heavily in the trading world. Multicast is especially valuable in market data distribution because all trading servers should receive the exact same packet stream at nearly the same time (fairness), without the exchange having to maintain thousands of independent TCP connections (efficiency).

In particular, we will focus on Protocol Independent Multicast - Sparse Mode (PIM-SM), a widely deployed multicast routing protocol, although in high-frequency trading (HFT) environments it is often simplified or combined with source-specific multicast (SSM) and L2 optimizations. PIM is called “protocol-independent” because it does not depend on a specific unicast routing protocol. It can leverage existing routing tables learned through OSPF, IS-IS, BGP, or other routing protocols.

Note that we will focus on the high-level idea rather than specific implementation details, since I am a software engineer, not a network engineer :)

What is Multicast

send-model

When we think about sending network packets, the first communication model that usually comes to mind is unicast. For example, TCP establishes a reliable bidirectional connection between one source and one destination. However, sometimes a single source wants to send the same messages to a group of destinations. The brute-force approach is to establish N separate unicast flows, one from the source to each destination.

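Note the difference between broadcast and multicast: broadcast delivers a packet to every host on the local network, whereas multicast delivers it only to the hosts that have joined the group. Also, you can send a packet to a group even if you yourself are not a member of that group.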

Below is an illustration of how we could use unicast to emulate multicast. For a group of 3 receivers, the source sends the identical packet 3 times, and all 3 copies are pushed through the 2 switches.

unicast

We need a more bandwidth-efficient way to implement multicast. Ideally, it should satisfy two requirements:

  • A router forwards a multicast packet at most once per outgoing interface that has downstream interested receivers

  • Hosts not in the interested group should not receive the packet

Conceptually, multicast routing builds a distribution tree inside the network. The key idea behind multicast is that packets are replicated only at branching points in the network instead of at the sender itself.
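To put a number on the saving, here is a small sketch that counts how many times a packet crosses a link under each approach. The topology is an assumption made for illustration (a source behind two switches in a chain, with three receivers on the second switch), and the names `PARENT`, `path_links`, `unicast_transmissions` and `multicast_transmissions` are hypothetical.

```python
# Hypothetical topology: each node maps to its parent on the path back to the source.
# Source -> SW1 -> SW2 -> {H1, H2, H3}
PARENT = {"SW1": "Source", "SW2": "SW1", "H1": "SW2", "H2": "SW2", "H3": "SW2"}

def path_links(receiver):
    """Return the (parent, child) links on the path from the source to `receiver`."""
    links = []
    node = receiver
    while node in PARENT:
        links.append((PARENT[node], node))
        node = PARENT[node]
    return links

def unicast_transmissions(receivers):
    # One copy per receiver crosses every link on that receiver's path.
    return sum(len(path_links(r)) for r in receivers)

def multicast_transmissions(receivers):
    # One copy per link of the distribution tree, however many receivers sit below it.
    return len({link for r in receivers for link in path_links(r)})

receivers = ["H1", "H2", "H3"]
print(unicast_transmissions(receivers))    # 9
print(multicast_transmissions(receivers))  # 5
```

With unicast, the links Source -> SW1 and SW1 -> SW2 each carry three copies. With a distribution tree they carry one copy each, and the packet is replicated only at SW2.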

The IPv4 addresses from 224.0.0.0 to 239.255.255.255 (the 224.0.0.0/4 block) are reserved for multicast groups, so you cannot assign an address in this range to an individual host.
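As a quick sanity check, the range can be tested with Python's standard `ipaddress` module. This is a minimal sketch. The helper name `is_multicast_group` is made up for illustration.

```python
import ipaddress

# 224.0.0.0/4 covers 224.0.0.0 through 239.255.255.255.
MULTICAST_V4 = ipaddress.ip_network("224.0.0.0/4")

def is_multicast_group(addr: str) -> bool:
    """Return True if `addr` is an IPv4 address inside the multicast range."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 4 and ip in MULTICAST_V4

for addr in ["239.1.1.1", "224.0.0.0", "239.255.255.255", "223.255.255.255", "10.0.0.1"]:
    print(addr, is_multicast_group(addr))
```

The first three addresses are inside the range and the last two are not.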

IP Multicast

The IP multicast service model defines three operations for end hosts:

  1. send packets to a group (even if you are not a part of that group yourself).
  2. announce that you are joining a group.
  3. announce that you are leaving a group.

The end hosts do not need to consider how the multicast packet is delivered to everyone in that group. The multicast routing infrastructure attempts to efficiently distribute packets to all subscribed members of the group.
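From an application's point of view, the three operations map onto ordinary UDP sockets. The sketch below uses Python's standard `socket` module. The function names (`make_mreq`, `send_to_group`, `join_group`, `leave_group`) are hypothetical helpers, and it assumes a host with a multicast-capable interface and lets the kernel pick that interface.

```python
import socket
import struct

def make_mreq(group: str, iface: str = "0.0.0.0") -> bytes:
    """Pack the membership request: group address, then local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface))

def send_to_group(payload: bytes, group: str, port: int, ttl: int = 1) -> None:
    """Operation 1: send to a group. The sender does not have to join it."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP) as sock:
        # TTL limits how many router hops the packet may cross.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, struct.pack("b", ttl))
        sock.sendto(payload, (group, port))

def join_group(group: str, port: int) -> socket.socket:
    """Operation 2: join a group. The kernel announces the membership for us."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, make_mreq(group))
    return sock

def leave_group(sock: socket.socket, group: str) -> None:
    """Operation 3: leave the group, then release the socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP, make_mreq(group))
    sock.close()
```

A receiver would call something like `sock = join_group("239.1.1.1", 5000)` followed by `sock.recvfrom(2048)`, where the port 5000 is an arbitrary choice for this example. Note that nothing in this code says how packets reach the receiver. That is left entirely to the network.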

Host <-> First-Hop Router

IGMP (Internet Group Management Protocol) lets the first-hop router know which end hosts are members of which multicast groups.

At a high level, the router periodically asks the end hosts “which group(s) do you belong to?” and each end host replies with the list of multicast groups it wants to subscribe to. End hosts can also send unsolicited reports about their group membership to the router.

In this way, the first-hop router stays informed about the latest multicast group membership subscription of each end host. If the router doesn’t receive an update about a membership for a long time, the router will assume that membership has expired and invalidate it.
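This refresh-or-expire behaviour is often called soft state. Below is a minimal sketch of such a table. It is an illustration of the idea rather than the IGMP state machine: the class name `MembershipTable`, the per-host bookkeeping, and the 10-second timeout are all assumptions, and a fake clock is injected so the example runs instantly.

```python
import time

class MembershipTable:
    """Soft-state group membership, roughly as a first-hop router might track it."""

    def __init__(self, timeout_s: float, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self._last_report = {}  # (group, host) -> time of the most recent report

    def report(self, host: str, group: str) -> None:
        """Record a membership report, whether solicited by a query or not."""
        self._last_report[(group, host)] = self.clock()

    def expire(self) -> None:
        """Drop every membership that has not been refreshed within the timeout."""
        now = self.clock()
        self._last_report = {
            key: seen for key, seen in self._last_report.items()
            if now - seen <= self.timeout_s
        }

    def members(self, group: str) -> list:
        """Return the hosts currently considered members of `group`."""
        self.expire()
        return sorted(host for (g, host) in self._last_report if g == group)

# Drive the table with a fake clock (timeout value chosen arbitrarily).
now = [0.0]
table = MembershipTable(timeout_s=10.0, clock=lambda: now[0])
table.report("H1", "239.1.1.1")
table.report("H2", "239.1.1.1")
now[0] = 8.0
table.report("H1", "239.1.1.1")    # H1 refreshes; H2 stays silent
now[0] = 15.0
print(table.members("239.1.1.1"))  # ['H1']
```

The appeal of soft state is that a host that crashes or is unplugged needs no explicit cleanup: its membership simply ages out.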

Routing

Now we will talk about how, in PIM-SM, each router learns where to forward a multicast packet on its way from a source to a group of destinations. There are two phases: rendezvous point (RP) routing and shortest-path tree (SPT) switch over.

The rendezvous point (RP) acts as a temporary meeting place between multicast sources and multicast receivers. Routers first build a shared multicast tree rooted at the RP, then build and switch to a shortest-path tree rooted at the source.

We will use the following topology example for illustration:

  • multicast group G = 239.1.1.1
  • H1 and H2 are interested in joining G while H3 is not

base_topology

Join the Group

First, hosts interested in G inform the first-hop routers using IGMP:

H1 -> R1: "I want traffic for G"
H2 -> R1: "I want traffic for G"

Rendezvous Point Routing

The RP is chosen ahead of time, either through a bootstrap mechanism or by manual configuration. The receivers' first-hop routers send multicast join messages toward the RP:

R1 -> RP: JOIN(*, G)

The notation (*, G) means traffic from ANY source to multicast group G.
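The state that these joins leave behind can be sketched as a per-router table of downstream neighbors for each (source, group) pair. This is a simplified model, not the PIM-SM message format or state machine: the names `oifs`, `NEXT_HOP_TO_RP`, `igmp_report`, `join_toward_rp` and `deliver_from_rp` are hypothetical, and the topology is assumed to have R1 directly adjacent to the RP.

```python
from collections import defaultdict

G = "239.1.1.1"
ROUTERS = {"R1", "RP"}
# Assumed unicast next hop toward the RP, as the existing routing table would give it.
NEXT_HOP_TO_RP = {"R1": "RP"}

# oifs[router][(source, group)] -> downstream neighbors that want the traffic
oifs = defaultdict(lambda: defaultdict(set))

def igmp_report(router, host, group):
    """A host told its first-hop router that it wants traffic for `group`."""
    oifs[router][("*", group)].add(host)

def join_toward_rp(router, group):
    """Send JOIN(*, group) hop by hop; each upstream router records who asked."""
    node = router
    while node != "RP":
        upstream = NEXT_HOP_TO_RP[node]
        oifs[upstream][("*", group)].add(node)
        node = upstream

def deliver_from_rp(group):
    """Walk the shared tree down from the RP and return the hosts a packet reaches."""
    reached, frontier = [], ["RP"]
    while frontier:
        node = frontier.pop()
        for neighbor in sorted(oifs[node][("*", group)]):
            if neighbor in ROUTERS:
                frontier.append(neighbor)
            else:
                reached.append(neighbor)
    return sorted(reached)

igmp_report("R1", "H1", G)
igmp_report("R1", "H2", G)
join_toward_rp("R1", G)
print(deliver_from_rp(G))  # ['H1', 'H2']
```

H3 never reported interest, so no router holds state for it and it receives nothing.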

At this point, if the source sends a multicast message to group G, the traffic flows as follows:

                 /→ H1
Source → RP → R1
                 \→ H2

base_topology

Notice that the detour through the RP is wasted work, since Source is directly connected to R1.

SPT Switch Over

After the multicast traffic starts flowing, a router may decide to switch from the shared RP tree to a shortest-path tree (SPT) rooted directly at the source.

Router R1 realizes that Source -> R1 is shorter than Source -> RP -> R1, so it sends a source-specific multicast join toward the source:

R1 -> Source: JOIN(S, G)
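The comparison R1 makes can be sketched as a hop-count check on the unicast topology. The links below are an assumption about the example topology (Source adjacent to both the RP and R1), hop count stands in for whatever metric the unicast routing protocol uses, and `hop_count` is a hypothetical helper. Real routers apply local policy to decide when to switch, so treat this as an illustration of the reasoning only.

```python
from collections import deque

# Assumed links in the example topology.
LINKS = [("Source", "RP"), ("RP", "R1"), ("Source", "R1")]

def build_graph(links):
    """Turn a list of undirected links into an adjacency map."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def hop_count(graph, start, goal):
    """Breadth-first search: number of hops on a shortest path, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

graph = build_graph(LINKS)
via_rp = hop_count(graph, "Source", "RP") + hop_count(graph, "RP", "R1")
direct = hop_count(graph, "Source", "R1")
should_switch = direct < via_rp
print(via_rp, direct, should_switch)  # 2 1 True
```

Because the direct path is shorter, R1 has a reason to send JOIN(S, G) toward the source.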

After this switch over, the traffic flow becomes

            /→ H1
Source → R1
            \→ H2

base_topology

Notice that the packet crosses the link Source -> R1 only once and is replicated only at the last hop, R1, before branching out to H1 and H2. This replication can be done by an optimized hardware ASIC instead of by software in the router, which reduces latency further.

Conclusion

Multicast provides an efficient way to distribute the same data stream to multiple receivers without duplicating traffic at the source. Instead of pushing N identical unicast streams, the network builds a shared distribution tree and replicates packets only where paths diverge.

In practice, multicast buys simplicity at the application level at the cost of complexity in the network layer, but for systems like market data distribution, where latency and bandwidth efficiency matter, this trade-off is worth it.

Reference

  1. Berkeley CS 168 Textbook

  2. IGMP protocol

  3. PIM-SM sparse mode