Motivation:

Sometimes we want to send the same data to multiple destinations (for example, to support radio, TV, meetings, lectures, news & software distribution, etc. over the Internet). If the source sends a copy to each destination, many copies will traverse the links near the source over and over again, which limits the amount of data that can be sent. Imagine if host A wants to send the same data to hosts B..H:

        A     B
        |     |
    C--[R1]--[R2]--D
        |     |
        |     |
    E--[R3]--[R4]--F
        |     |
        G     H

If the link from A to R1 is 10 Mbps, then A can effectively send at only 10/7 Mbps, because it must send 7 copies of everything.

It would be nice if the source could send a single copy, and let the routers duplicate it at appropriate places within the network, so that only one copy traverses any link. In the above example, R1 receives a copy from A, and would forward copies to C, R3, and R2. R2 would forward copies to B and D, while R3 would forward copies to E and G. And either R2 or R3 (but hopefully not both) would forward a copy to R4, which would forward copies to F and H. The branching path from the source to the receivers should be a tree (to avoid duplication), and is called a distribution tree.

IP multicast service model: (Steve Deering, RFC 1112, 1989)

Any host may send a packet to a group by using a group address (class D address, beginning 1110) as the destination address. (Depending on the capabilities of the link layer, a multicast link-layer address might be constructed from the multicast IP address, or the packet might be broadcast on the link, or sent to a default router.) The routers will try to forward a copy of the packet to every host that is subscribed to the group address. Multicast is best-effort, like unicast (normal IP forwarding).

End-hosts join (subscribe to) and leave (unsubscribe from) multicast groups using the Internet Group Management Protocol (IGMP).
Basically, they announce their joining and leaving intentions on their local networks, so that the routers know who belongs to the group. Notice that a host must join a group in order to receive packets sent to the group, but need not be a member in order to send packets to the group.

Multicast routing:

Suppose host H subscribes to a multicast address, and host A sends a packet to that multicast address. R1 sees the multicast address in the packet, but doesn't know that H is subscribed. R4 knows that H is subscribed, but doesn't know who is sending to the group. How can the packet get from R1 to R4?

Flood and prune:

One of the first methods was the Distance-Vector Multicast Routing Protocol (DVMRP), which used the flood-and-prune technique. A packet sent to any multicast address is forwarded to all routers. In the past we've seen a flooding technique in which every node remembers which messages it has already seen, to avoid forwarding the same message twice, but that would be impractical for IP packets, so DVMRP uses reverse path forwarding: Each router knows the next hop toward any unicast address. When a router receives a multicast packet on the interface leading to the packet's source address, it forwards the packet on all its other interfaces; otherwise it discards the packet. For each source, there is a shortest-path tree leading from every node to that source, and this tree is used in the opposite direction to carry multicast packets from that source to every node.

As an optimization, routers that wish to stop receiving multicast packets with particular source/group addresses can send prune messages to their upstream routers (and can later send graft messages if they change their mind). But when a source first sends a multicast packet to a group address, the router has no way of knowing where the members are, so it must flood. Obviously, flooding multicast packets to everyone won't scale up to large numbers of groups and sources over the whole Internet.
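As a rough illustration of the reverse path forwarding check described above, the following Python sketch floods one packet from host A through the example topology from the motivation section. It is only a sketch under stated assumptions: unicast routes are hop-count shortest paths with an arbitrary (alphabetical) tie-break, any node whose name starts with "R" is treated as a router, and pruning is ignored. The function names are invented for this example and are not part of DVMRP.

```python
from collections import deque

# Links of the example topology (hosts A-H, routers R1-R4).
LINKS = [
    ("A", "R1"), ("C", "R1"), ("B", "R2"), ("D", "R2"),
    ("E", "R3"), ("G", "R3"), ("F", "R4"), ("H", "R4"),
    ("R1", "R2"), ("R1", "R3"), ("R2", "R4"), ("R3", "R4"),
]

def build_neighbors(links):
    """Turn a list of undirected links into a node -> set-of-neighbors map."""
    neighbors = {}
    for u, v in links:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return neighbors

def next_hops_toward(source, neighbors):
    """Breadth-first search from the source. Each node's BFS parent is its
    unicast next hop toward the source (assumption: hop-count metric)."""
    next_hop = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in sorted(neighbors[node]):  # sorted: deterministic tie-break
            if nbr not in next_hop:
                next_hop[nbr] = node
                queue.append(nbr)
    return next_hop

def rpf_flood(source, neighbors):
    """Flood one packet from source using the RPF check.
    Returns (copies received per host, copies discarded by routers)."""
    next_hop = next_hops_toward(source, neighbors)
    received = {}
    dropped = 0
    # Each entry is (sender, receiver) for one copy in flight on a link.
    in_flight = deque((source, nbr) for nbr in sorted(neighbors[source]))
    while in_flight:
        sender, node = in_flight.popleft()
        if not node.startswith("R"):         # a host: just count the copy
            received[node] = received.get(node, 0) + 1
            continue
        if next_hop[node] != sender:         # arrived on the wrong interface
            dropped += 1
            continue
        for nbr in sorted(neighbors[node]):  # forward on all other interfaces
            if nbr != sender:
                in_flight.append((node, nbr))
    return received, dropped

received, dropped = rpf_flood("A", build_neighbors(LINKS))
print(sorted(received.items()), dropped)
```

With A as the source, each of the hosts B..H receives exactly one copy. The two copies that cross the R3--R4 link (one in each direction) are discarded by the RPF check, because with this tie-break R4's next hop toward A is R2 and R3's is R1.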
Shared trees:

In reverse-path forwarding there is a separate tree for each source. If we wish to avoid flooding, we must find some other way to connect the sources to the group members. We can pick a unicast address, known to both the routers near the members and the routers near the sources, as the root of a shared tree, whose leaves are the routers adjacent to the group members. (The root is called a rendezvous point in Protocol Independent Multicast (PIM), or a core in Core Based Trees (CBT).) When a source sends a packet to the group, the packet is first forwarded to the root, then along the tree to all the members. When a member joins a multicast group (using IGMP), its local router sends a join request toward the root, which builds a new branch of the distribution tree, stopping when it hits the existing tree.

Notice that the packets sent by a source are no longer taking the shortest path. As an optimization, if a few sources are sending a lot of traffic, the leaf routers can choose to build additional source-specific trees just for those sources, by sending source-specific join requests toward the sources.

There is still the problem of how the routers know the root address for a given multicast address. They could be configured with a table or a hash function.

This is more scalable than flooding, but there is still a problem. When a source and destination are within the same autonomous system (AS), the AS prefers to use a route completely within the AS, to control costs, control quality of service, and avoid being adversely affected by routing problems beyond its control.
But with a shared multicast tree, the packets always go through the AS containing the root, even on the way from a source in another AS to members in that same AS:

    +------------------+
    |       AS1        |
    |       root       |
    |       /  \       |
    +------/----\------+
          /      \
    +----/--------\----+
    | source    member |
    |                  |
    |       AS2        |
    +------------------+

Interdomain multicast routing:

So each autonomous system can use DVMRP or PIM or CBT internally, but we need an interdomain multicast routing protocol to connect them. The Border Gateway Multicast Protocol (BGMP) builds a bidirectional tree of autonomous systems for each group, where the root of the tree is an entire AS. Packets sent by a source in some AS go to members in that AS via the internal multicast routing, and go to other AS's via the BGMP tree. The tree is bidirectional, so packets don't need to go to the root AS first; they just spread outward from the source's AS to the others along the tree.

It's still necessary for all border routers to know, for any given multicast address, which AS is the root AS for that group. If multicast addresses are allocated in blocks to each AS, then the border routers can maintain a table listing which blocks belong to which AS's. The Multicast Address Set Claim (MASC) protocol has been proposed for dynamic allocation of multicast addresses to AS's.

Mbone:

Multicast is not enabled in the routers of most ISPs, because they would risk losing control over how much traffic flows through them. At present, there are islands of multicast-enabled routing domains (like many university campuses), interconnected by manually configured tunnels, in which unicast forwarding is used over a multihop path as a virtual link between two multicast-capable routers. This global virtual network is called the multicast backbone (Mbone).

Multicast transport:

One big challenge is congestion control. If a source sends at a particular rate, that may be too fast for the links on the paths to some group members, but not others.
The source could slow down, which improves the situation for some members (who are no longer experiencing packet losses), but worsens it for others (who now have to settle for a slower connection, when they were perfectly happy with the faster rate).

For streaming media applications, Receiver-driven Layered Multicast (RLM) is a useful technique. The source video or audio is divided into layers (not to be confused with protocol stack layers), so that the "lowest" layer carries a low-quality version of the signal, and each subsequent layer carries additional information that, when combined with the lower layers, yields a higher-quality version of the signal. Each layer can be sent to a different multicast group address, and members can subscribe to as many groups as they can before they start losing packets.

For applications requiring reliability, like distribution of news articles or shared whiteboards, loss recovery is a challenge. When a packet is lost in the middle of the network, some members fail to receive it, but other members do receive it. Should the source unicast a retransmission to every member who didn't get it, or multicast a retransmission to the entire group? For large groups, both approaches may be very inefficient. This is an area of active research.
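Returning to the layered approach for streaming media: the outcome of receiver-driven layering can be illustrated with a small Python sketch. This is not the RLM protocol itself (a real receiver does not know its bottleneck rate and has to discover it by joining layers and watching for loss). The sketch simply assumes the bottleneck rate is known and computes how many layers, starting from the lowest, fit within it. The layer rates and the function name are hypothetical values chosen for illustration.

```python
def layers_to_join(layer_rates_kbps, bottleneck_kbps):
    """Return how many layers, counted from the lowest layer upward,
    fit within the receiver's bottleneck rate (assumed known here)."""
    total = 0
    count = 0
    for rate in layer_rates_kbps:
        if total + rate > bottleneck_kbps:
            break                 # the next layer would cause packet loss
        total += rate
        count += 1
    return count

# Hypothetical layer rates: one multicast group per layer.
LAYER_RATES_KBPS = [64, 128, 256, 512]

# Receivers behind different bottlenecks end up with different quality,
# while the source keeps sending every layer at a fixed rate.
for bottleneck in (50, 100, 500, 2000):
    print(bottleneck, layers_to_join(LAYER_RATES_KBPS, bottleneck))
```

The point of the design is that the source never has to choose a single rate: a receiver behind a slow link subscribes to fewer groups, and a receiver behind a fast link subscribes to more.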