Wiley IP Multicast With Applications To IPTV and Mobile
APPLICATIONS TO IPTV
AND MOBILE DVB-H
Daniel Minoli
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by
any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted
under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission
of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance
Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at
www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions
Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008,
or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in
preparing this book, they make no representations or warranties with respect to the accuracy or completeness
of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for
a particular purpose. No warranty may be created or extended by sales representatives or written sales
materials. The advice and strategies contained herein may not be suitable for your situation. You should consult
with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or
any other commercial damages, including but not limited to special, incidental, consequential, or other
damages.
For general information on our other products and services or for technical support, please contact our
Customer Care Department within the United States at (800) 762-2974, outside the United States at
(317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be
available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Also thanking
Mike Neen
CONTENTS
Preface
About the Author
1 INTRODUCTION TO IP MULTICAST
1.1 Introduction
1.2 Why Multicast Protocols are Wanted/Needed
1.3 Basic Multicast Protocols and Concepts
1.4 IPTV and DVB-H Applications
1.5 Course of Investigation
Appendix 1.A: Multicast IETF Request for Comments
Appendix 1.B: Multicast Bibliography
References
7 MULTICAST ROUTING—DENSE-MODE
PROTOCOLS: PIM DM
7.1 Overview
7.2 Basic PIM DM Behavior
7.3 Protocol Specification
7.3.1 PIM Protocol State
7.3.2 Data Packet Forwarding Rules
7.3.3 Hello Messages
7.3.4 PIM DM Prune, Join, and Graft Messages
7.3.5 State Refresh
7.3.6 PIM Assert Messages
7.3.7 PIM Packet Formats
References
Glossary
Index
PREFACE
This book updates early-release published work undertaken by the author in the
early-to-mid-1990s on the topic of video-for-telcos (“telco TV”), video-over-packet,
video-over-DSL, and video-over-ATM contained in the book Video Dialtone Technology:
Digital Video over ADSL, HFC, FTTC, and ATM, McGraw-Hill, 1995, and based
on extensive hands-on work on broadband communications and digital video/digital
imaging. At this juncture, the focus of this book (and for this industry) is completely on
commercial-quality video over IP, IPTV.
Of late there has been renewed interest in IP multicast protocols and technologies
because of the desire by traditional telephone companies to deliver entertainment-level
video services over their network using next-generation infrastructures based on IP
networking, by the cell phone companies for video streams to handheld telephone sets
and personal digital assistants (PDAs), and by the traditional TV broadcast companies
seeking to enter the same mobile video market. Critical factors in multicasting include
bandwidth efficiency and delivery tree topology optimization.
IP multicast technology is stable and relatively easy to implement, particularly for
architecturally simple (yet large) networks. A lot of the basic IP multicast mechanisms
were developed in the mid-to-late 1980s, with other basic work undertaken in the 1990s.
A number of recent functional enhancements have been added. From a commercial
deployment perspective, IP multicast is now where IP was in the mid-1990s: poised to
take off and experience widespread deployment. Examples of applications requiring
one-to-many or many-to-many communications include but are not limited to digital
entertainment video and audio distribution, multisite corporate videoconferencing,
broad distribution financial data, stock quotes and news bulletins, database replication,
software distribution, and content caching (for example, Web site caching).
The text literature on IP multicast is limited and somewhat dated, particularly in
reference to IPTV applications. This compact text is intended for practitioners that seek
a quick practical review of the topic with emphasis on the major and most-often used
aspects of the technology. Given its focus on IPTV and DVB-H it can also be used by
technology integrators and service providers that wish to enter this field.
Following an introductory discussion in Chapter 1, Chapter 2 covers multicast
addressing for payload distribution. Chapter 3 focuses on multicast payload forwarding.
Chapter 4 covers the important topic of dynamic host registration using the Internet
Group Management Protocol. Chapter 5 looks at multicast routing in sparse-mode
environments and the broadly used PIM-SM. Chapter 6 discusses CBT. Chapter 7 looks
at multicast routing for dense-mode protocols and PIM-DM in particular. Chapter 8
examines DVMRP and MOSPF. The next chapter, Chapter 9, covers IP multicasting in
IPv6 environments. Chapter 10 looks at Multicast Listener Discovery (MLD) snooping
switches. Finally, Chapters 11 and 12 give examples in the IPTV and (mobile) DVB-H
environments, respectively. Portions of the presentation are pivoted off and summarized
from fundamental RFCs; other key sections are developed here for the first time, based
on the author’s multidecade experience in digital video. The reference RFCs and
protocols are placed in the proper context of a commercial-grade infrastructure for the
delivery of robust, entertainment-quality linear and nonlinear video programming.
Telephone carriers (telcos), cell phone companies, traditional TV broadcasters,
cable TV companies, equipment manufacturers, content providers, content aggregators,
satellite companies, venture capitalists, and colleges and technical schools can make use
of this text. The text can be used for a college course on IP multicast and/or IPTV. There
is now a global interest by all the telcos in Europe, Asia, and North America to enter the
IPTV and DVB-H market in order to replace revenues that have eroded to cable TV
companies and wireless providers. Nearly all the traditional telcos worldwide are
looking into these technologies at this juncture. Telcos need to compete with cable
companies, and IPTV and DVB-H are the way to do it. In fact, even the cable TV
companies themselves are looking into upgrading their ATM technology to IP. This
book is a brand-new look at the IP multicast space.
ABOUT THE AUTHOR
Daniel Minoli has many years of technical hands-on and managerial experience
(including budget and/or P&L responsibility) in networking, telecom, video, enterprise
architecture, and security for global best-in-class carriers and financial companies. He
has worked at AIG, ARPA think tanks, Bell Telephone Laboratories, ITT, Prudential
Securities, Bell Communications Research (now Telcordia), AT&T, Capital One
Financial, and SES AMERICOM, where he is director of terrestrial systems engineering.
Previously, he also played a founding role in the launching of two companies
through the high-tech incubator Leading Edge Networks Inc., which he ran in the early
2000s: Global Wireless Services, a provider of secure broadband hotspot mobile
Internet and hotspot VoIP services; and InfoPort Communications Group, an optical
and Gigabit Ethernet metropolitan carrier supporting Data Center/SAN/channel extension
and Grid Computing network access services.
For several years he has been Session-, Tutorial-, or overall Technical Program
Chair for the IEEE ENTNET (Enterprise Networking) conference. ENTNET focuses on
enterprise networking requirements for large financial firms and other corporate
institutions.
At SES AMERICOM, Mr. Minoli has been responsible for engineering satellite-based
IPTV and DVB-H systems. This included overall engineering design, deployment,
and operation of SD/HD encoding, inner/outer AES encryption, Conditional
Access Systems, video middleware, Set Top boxes, Headends, and related terrestrial
connectivity. At Bellcore/Telcordia, he did extensive work on broadband; on video-on-demand
for the RBOCs (then known as Video Dialtone); on multimedia over ISDN/ATM;
and on distance learning (satellite) networks. At DVI he deployed a (satellite-based)
distance-learning system for William Patterson College. At Stevens Institute of
Technology (Adjunct), he taught about a dozen graduate courses on digital video. At
AT&T, he deployed large broadband networks also to support video applications, for
example, video over ATM. At Capital One, he was involved with the deployment of
corporate Video-on-demand over the IP-based intranet. As a consultant he handled the
technology-assessment function of several high-tech companies seeking funding,
developing multimedia, digital video, physical layer switching, VSATs, telemedicine,
Java-based CTI, VoFR & VPNs, HDTV, optical chips, H.323 gateways, nanofabrication
(Quantum Cascade Lasers), wireless, and TMN mediation.
Mr. Minoli has also written columns for ComputerWorld, NetworkWorld, and
Network Computing (1985–2006). He has taught at New York University (Information
Technology Institute), Rutgers University, Stevens Institute of Technology, and
1.1 INTRODUCTION
Although “not much” new has occurred in the “science” of the Internet Protocol (IP)
multicast space in the past few years, there is now keen interest in this technology because
of the desire by traditional telephone companies to deliver entertainment-level video
services over their networks using next-generation infrastructures based on IP
networking and by the cell phone companies to deliver video streams to handheld
telephone sets and Personal Digital Assistants (PDAs). A critical factor in multicasting is
bandwidth efficiency in the transport network. IP multicast, defined originally in RFC
988 (Request for Comments) (1986) and then further refined in RFC 1054 (1988), RFC
1112 (1989), RFC 2236 (1997), RFC 3376 (2002), and RFC 4604 (2006), among others,
is the basic mechanism for these now-emerging applications. The technology is stable
and relatively well understood, particularly for architecturally simple (yet large)
networks.
Despite the opening statement above, enhancements to IP multicast have
actually occurred in the recent past, including the issuing of Internet Group Management
Protocol (IGMP), Version 3 (October 2002); the issuing of Multicast Listener Discovery
(MLD), Version 2 for IP Version 6 (IPv6) (June 2004); the issuing of Source-Specific
Multicast (SSM) for IP (August 2006); and the publication of new considerations for
IGMP and MLD snooping switches (May 2006). Work is also underway to develop
new protocols and architectures to enable better deployment of IP over Moving Picture
Experts Group 2 (MPEG-2) transport and provide easier interworking with IP networks.
From a commercial deployment perspective, IP multicast is now where IP was in the
mid-1990s: poised to take off and experience widespread deployment. Examples of
applications requiring one-to-many or many-to-many communications include, but are
not limited to, digital entertainment video and audio distribution, multisite corporate
videoconferencing, broad-distribution financial data, grid computing, stock quotes and
news bulletins distribution, database replication, software distribution, and content
caching (e.g., Web site caching).
This book provides a concise guide to the IP multicast technology and its applications.
It is an updated survey of the field with the underlying focus on IP-based Television
(IPTV)1 (also known in some quarters as telco TV) and Digital Video Broadcast—
Handheld (DVB-H) applications.
IPTV deals with approaches, technologies, and protocols to deliver commercial-
grade Standard-Definition (SD) and High-Definition (HD) entertainment-quality real-
time linear and on-demand video content over IP-based networks, while meeting all
prerequisite Quality of Service (QoS), Quality of Experience (QoE), Conditional Access
(CA) (security), blackout management (for sporting events), Emergency Alert System
(EAS), closed captions, parental controls, Nielsen rating collection, secondary audio
channel, picture-in-picture, and guide data requirements of the content providers and/or
regulatory entities. Typically, IPTV makes use of Moving Picture Experts Group 4
(MPEG-4) encoding to deliver 200–300 SD channels and 20–40 HD channels; viewers
need to be able to switch channels within 2 s or less; also, the need exists to support
multi-set-top boxes/multiprogramming (say 2–4) within a single domicile. IPTV is not to
be confused with simple delivery of video over an IP network (including video
streaming), which has been possible for over two decades; IPTV supports all business,
billing, provisioning, and content protection requirements that are associated with
commercial video distribution. IP-based service needs to be comparable to that received
over cable TV or direct broadcast satellite. In addition to TV sets, the content may also be
delivered to a personal computer. MPEG-4, which operates at 2.5 Mbps for SD video and
8–11 Mbps for HD video, is critical to telco-based video delivery over a copper-based
plant because of the bandwidth limitations of that plant, particularly when multiple
simultaneous streams need to be delivered to a domicile; MPEG-2 would typically
require a higher bit rate for the same perceived video quality. IP multicast is typically
employed to support IPTV.2
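The copper-plant constraint mentioned above can be made concrete with a small calculation. The sketch below uses only the MPEG-4 rates quoted in the text (2.5 Mbps for SD, 8 to 11 Mbps for HD); the function name household_load and the particular stream mixes are illustrative assumptions, not figures from the text.

```python
# Downstream load for one home, using the MPEG-4 rates quoted in the text.
SD_MBPS = 2.5                  # SD video, Mbps
HD_MBPS_RANGE = (8.0, 11.0)    # HD video, low and high Mbps

def household_load(sd_streams, hd_streams):
    """Return (low, high) Mbps for simultaneous streams to one home.

    The stream mix passed in is an assumption of the caller.
    """
    low = sd_streams * SD_MBPS + hd_streams * HD_MBPS_RANGE[0]
    high = sd_streams * SD_MBPS + hd_streams * HD_MBPS_RANGE[1]
    return low, high

# Four set-top boxes, three hypothetical mixes
for sd, hd in [(4, 0), (2, 2), (0, 4)]:
    print(sd, "SD +", hd, "HD:", household_load(sd, hd), "Mbps")
```

Even the all-SD case needs 10 Mbps, and any mix with several HD streams needs well over 20 Mbps, which is why the encoding rate matters so much on a copper loop.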
1 Some also use the expansion “IPTV (Internet TV),” e.g., [CHA200701]. We retain the more general
perspective of IPTV as TV (video, video on demand, etc.) distributed over any kind of IP-based network
(including possibly the Internet).
2 While some have advanced Peer-to-Peer (P2P) models for IPTV (e.g., see [CHA200701]), nearly all the
commercial deployment to date is based on the classical client–server model; this is the model discussed in
this book.
WHY MULTICAST PROTOCOLS ARE WANTED/NEEDED 3
3 Currently a typical digital TV package may consist of 200–250 SD signals each operating at 3 Mbps and
30–40 HD signals each operating at 12 Mbps; this equates to about 1 Gbps; as more HDTV signals are added,
the bandwidth will reach in the range of 2 Gbps.
users. If a source had to deliver 1 Gbps of signal to, say, one million receivers by
transmitting all of this bandwidth across the core network, it would require a
petabit-per-second network fabric; this is not currently possible. By contrast, if the source
could send the 1 Gbps of traffic to (say) 50 remote distribution points (e.g., headends),
each of which then makes use of a local distribution network to reach 20,000 subscribers,
the core network needs to support 50 Gbps only, which is possible with proper design. For
these kinds of reasons, IP multicast is seen as a bandwidth-conserving technology that
optimizes traffic management by simultaneously delivering a stream of information to a
large population of recipients, including corporate enterprise users and residential
customers. See Figure 1.1 for a pictorial example.
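The arithmetic behind this comparison can be checked in a few lines. The sketch below uses only the numbers given in the paragraph above (1 Gbps package, one million receivers, 50 distribution points of 20,000 subscribers each); the variable names are illustrative.

```python
# Core bandwidth needed to deliver one 1 Gbps TV package (rates in bits per second).
GBPS = 10**9
PETABIT = 10**15

package_rate = 1 * GBPS        # aggregate rate of the channel lineup
receivers = 1_000_000          # total subscribers
distribution_points = 50       # headends fed by the core
subs_per_point = 20_000        # subscribers behind each headend

# Unicast: the source sends a separate copy to every receiver.
unicast_core = package_rate * receivers

# Distribution-point model: one copy per headend crosses the core.
multicast_core = package_rate * distribution_points

# The 50 headends together serve the same one million receivers.
assert distribution_points * subs_per_point == receivers

print(unicast_core / PETABIT, "Pbps across the core for unicast")
print(multicast_core / GBPS, "Gbps across the core for 50 distribution points")
print(unicast_core // multicast_core, "times less core traffic")
```

The reduction factor equals the number of subscribers behind each distribution point, since each headend receives one copy on behalf of all of them.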
[Figure 1.1: Traditional unicast IP versus multicast IP. S = Source; R = Receiver.]
4 A program in this context equates to a video channel, more specifically to an MPEG-2/4 transport stream
with a given Program ID (PID) (this topic is revisited in Chapters 2 and 11).
addresses have a hex “01” in the first byte of the six-octet destination address.
The Internet Assigned Numbers Authority (IANA) manages the assignment of IP
addresses at layer 3, and it has assigned the (original) Class D address space to be used
for IP multicast. A Class D address consists of 1110 as the higher order bits in the first
octet, followed by a 28-bit group address. A 1110-0000 address in the first byte starts
at 224 in the dotted decimal notation; a typical address might be 224.10.10.1, and so
on. All IP multicast group addresses belong to the range 224.0.0.0–239.255.255.255.
In addition, all IPv6 hosts are required to support multicasting. The mapping of
IP multicast addresses to Ethernet addresses takes the lower 23 bits of the Class D
address and maps them into a block of Ethernet addresses that have been allocated for
multicast.
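The address rules above can be illustrated with a short sketch. It checks the 1110 prefix of a Class D address and applies the 23-bit mapping to an Ethernet address. The 01-00-5E prefix used for the Ethernet block is the IANA allocation documented in RFC 1112 and is not stated in the text above; the function names are illustrative.

```python
import ipaddress

def is_class_d(addr: str) -> bool:
    """True if the first four bits are 1110 (224.0.0.0 to 239.255.255.255)."""
    first_octet = int(ipaddress.IPv4Address(addr)) >> 24
    return (first_octet >> 4) == 0b1110

def multicast_mac(addr: str) -> str:
    """Map an IPv4 group address to its Ethernet multicast address."""
    if not is_class_d(addr):
        raise ValueError(f"{addr} is not an IPv4 multicast address")
    low23 = int(ipaddress.IPv4Address(addr)) & 0x7FFFFF   # keep the lower 23 bits
    mac = 0x01005E000000 | low23                           # block 01-00-5E (RFC 1112)
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))

print(multicast_mac("224.10.10.1"))    # 01:00:5e:0a:0a:01
print(multicast_mac("239.138.10.1"))   # same MAC: the upper 5 group bits are dropped
```

Because the group address has 28 bits and only 23 are carried over, 32 different IP groups share each Ethernet address, so a host may still need to filter at layer 3.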
Dynamic Host Registration—There must be a mechanism that informs the network
that a host (receiver) is a member of a particular group (otherwise, the network would
have to flood rather than multicast the transmissions for each group). For IP networks, the
IGMP serves this purpose.
Multicast Payload Forwarding—Typical IP multicast applications make use of User
Datagram Protocol (UDP) at the transport layer and IP at the network layer. UDP is a
“best effort delivery” protocol with no guarantee of delivery; it also lacks congestion
management mechanisms [such as those utilized in Transmission Control Protocol
(TCP)]. Real-time applications such as commercial live video distribution do not (and
cannot) make use of a retransmission mechanism (such as the one utilized in TCP). In
some cases, portions of the network may be simplex (such as a satellite link), practically
precluding end-to-end retransmission. Hence, the risk exists for audio and video
broadcasts to suffer content degradation due to packet loss. To minimize lost packets,
one must provision adequate bandwidth and/or keep the distribution networks simple and
with as few hops as possible. IP QoS (diffserv), the Real-Time Transport Protocol (RTP),
and 802.1p at layer 2 are often utilized to manage QoS. [To minimize in-packet bit
corruption, Forward Error Correction (FEC) mechanisms may be used—a state-of-the-
art mechanism can improve Bit Error Rates (BERs) by an impressive four or five orders
of magnitude.]
Multicast Routing—A multicast network requires a mechanism to build distribu-
tion trees that define a unique forwarding path between the subnet of the content source
and each subnet containing members of the multicast group, specifically, receivers. A
principle utilized in the construction of distribution trees is to guarantee that at most
one copy of each packet is forwarded on each branch of the tree. This is implemented by
ascertaining that there is sufficient real-time topological information at the multicast
router of the source host for constructing a spanning tree rooted at said multicast router
(or other appropriate router) and providing connectivity to the local multicast routers of
each receiving host. A multicast router forwards multicast packets to two types of
devices: downstream-dependent routers and receivers (hosts) that are members of a
particular multicast group. See Table 1.1 for a list of some key multicast-related
protocols.
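The "at most one copy per branch" principle can be sketched as follows. This is a minimal illustration, assuming a connected topology and hop-count shortest paths found by breadth-first search; real multicast routing protocols use their own metrics and message exchanges, and the router names and the source_tree function are hypothetical.

```python
from collections import deque

def source_tree(links, source_router, receiver_routers):
    """Return the set of (parent, child) branches of a source-rooted tree.

    links: dict mapping each router to the list of its neighbors.
    Assumes every receiver router is reachable from the source router.
    """
    parent = {source_router: None}
    queue = deque([source_router])
    while queue:                          # breadth-first search from the source
        node = queue.popleft()
        for nbr in links[node]:
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)

    branches = set()
    for rcv in receiver_routers:
        node = rcv
        while parent[node] is not None:   # walk back toward the source
            branches.add((parent[node], node))   # a set keeps each branch once
            node = parent[node]
    return branches

# Hypothetical topology: S is the source's router, C and D serve receivers.
links = {"S": ["A", "B"], "A": ["S", "C", "D"], "B": ["S", "D"],
         "C": ["A"], "D": ["A", "B"]}
print(sorted(source_tree(links, "S", ["C", "D"])))
```

Both receivers are reached through the single branch S to A, so that link carries one copy of each packet and replication happens only at A.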
Multicast routing protocols belong to one of two categories: Dense-Mode (DM)
protocols and Sparse-Mode (SM) protocols.
BASIC MULTICAST PROTOCOLS AND CONCEPTS 7
For IP multicast, there are several multicast routing protocols that can be employed
to acquire real-time topological and membership information for active groups. Routing
protocols that may be utilized include Protocol Independent Multicast (PIM), the Distance
Vector Multicast Routing Protocol (DVMRP), Multicast Open Shortest Path First (MOSPF),
and Core-Based Trees (CBT). Multicast routing protocols build distribution trees by
examining the routing forwarding table that contains unicast reachability information.
PIM and CBT use the unicast forwarding table of the router. Other protocols use their
specific unicast reachability routing tables; for example, DVMRP uses its distance vector
routing protocol to determine how to create source-based distribution trees, whereas
MOSPF utilizes its link-state table to create source-based distribution trees. MOSPF,
DVMRP, and PIM DM are DM routing protocols, whereas CBT and PIM SM are SM routing
protocols. PIM is currently the most widely used protocol.
Specifically, PIM Version 2 (PIMv2) is a protocol that provides intradomain
multicast forwarding for all underlying unicast routing protocols [e.g., Open Shortest
Path First (OSPF) or Border Gateway Protocol (BGP)], independent from the intrinsic
unicast protocol. Two modes
exist: PIM SM and PIM DM.5
PIM DM (defined in RFC 3973, January 2005) is a multicast routing protocol that
uses the underlying unicast routing information base to flood multicast datagrams to all
multicast routers. Prune messages are used to prevent future messages from propagating
to routers without group membership information [RFC3973]. PIM DM attempts to
send multicast data to all potential receivers (flooding) and relies upon their self-pruning
(removal from the group) to achieve distribution. In PIM DM, multicast traffic
is initially flooded to all segments of the network. Routers that have no downstream
neighbors or directly connected receivers prune back the unwanted traffic. PIM DM
assumes most receivers (hosts, PCs, TV viewers, cellular phone handsets) wish to
receive the multicast; therefore the protocol forwards the multicast datagrams
everywhere, and then routers prune the distribution tree where it is not needed. PIM
is now being utilized for IPTV applications; typically DM is used in the backbone;
however, SM could also be utilized in some applications or portions of the overall
network.
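The flood-and-prune behavior described above can be sketched as a toy simulation. It is not the RFC 3973 state machine (there are no timers, Grafts, or Asserts); it only shows which routers remain on the tree once routers with no downstream neighbors or directly connected receivers have pruned. The topology and the flood_and_prune function are hypothetical.

```python
def flood_and_prune(children, root, has_receivers):
    """Return (flooded, kept) for a dense-mode style distribution.

    children: dict router -> list of downstream routers (a tree).
    has_receivers: set of routers with directly connected group members.
    flooded lists every router reached by the initial flood;
    kept is the set of routers still forwarding after pruning.
    """
    flooded = []
    kept = set()

    def visit(router):
        flooded.append(router)             # flood: every router sees the traffic
        needed = router in has_receivers
        for child in children.get(router, []):
            if visit(child):               # some downstream router wants the traffic
                needed = True
        if needed:
            kept.add(router)
        return needed                      # False stands for "send Prune upstream"

    visit(root)
    return flooded, kept

children = {"R1": ["R2", "R3"], "R2": ["R4", "R5"], "R3": ["R6"]}
flooded, kept = flood_and_prune(children, "R1", has_receivers={"R4"})
print(flooded)        # all six routers receive the initial flood
print(sorted(kept))   # only R1, R2, and R4 remain after pruning
```

The cost of the dense-mode assumption is visible in the gap between the two lists: six routers handle the traffic at first, although only three are needed.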
In PIM SM, only network segments with active receivers that have explicitly
requested multicast data are forwarded the traffic. PIM SM relies on an explicit joining
request before attempting to send multicast data to receivers of a multicast group. In a
PIM SM network, sources must send their traffic to a Rendezvous Point (RP); this traffic
is in turn forwarded to receivers on a shared distribution tree. SM works by routers
sending PIM Join messages to start the multicast feed being sent across links. The
assumption in SM is that relatively few users need the multicast information and
therefore PIM SM starts with no flooding of multicast. In short order, router-to-router
PIM Join messages cause the multicast stream to be forwarded across links to where it is
needed. This is the current standard for Internet Service Providers (ISPs) supporting
Internet multicast [WEL200101].
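The explicit-join model can be contrasted with flooding in a similar toy simulation. The sketch below starts with no active links and lets each Join travel hop by hop toward the RP until it meets a link that is already on the shared tree. It is a simplification under stated assumptions: the next-hop table, router names, and the join_shared_tree function are hypothetical, and no PIM timers or message formats are modeled.

```python
def join_shared_tree(next_hop_to_rp, rp, joining_routers):
    """Return the links (upstream, downstream) activated by Join messages.

    next_hop_to_rp: dict router -> its next hop toward the RP.
    No link carries traffic until a Join has crossed it.
    """
    active = set()
    for router in joining_routers:
        node = router
        while node != rp:
            upstream = next_hop_to_rp[node]
            link = (upstream, node)
            if link in active:
                break                 # already on the tree; the Join stops here
            active.add(link)
            node = upstream
    return active

next_hop = {"R2": "RP", "R3": "RP", "R4": "R2", "R5": "R2", "R6": "R3"}
tree = join_shared_tree(next_hop, "RP", ["R4", "R5"])
print(sorted(tree))   # R3 and R6 never see the stream
```

With the same style of topology as a dense-mode network, routers without interested receivers are never touched, which is the behavior the sparse-mode assumption is designed to achieve.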
An RP (described in RFC 2362) acts as the meeting place for sources and receivers of
multicast data. It is required only in networks running PIM SM and is needed only to start
new sessions with sources and receivers. In a PIM SM network, sources send their traffic
to the RP; this traffic is in turn forwarded to receivers downstream on a shared distribution
tree. A Designated Router (DR) is the router on a subnet that is selected to control
multicast routes for the members on its directly attached subnet. The receiver sends an
IGMP Join message (see below) to this designated multicast router.6 IP multicast traffic
transmitted from the multicast source is distributed over the tree, via the designated
router, to the receiver’s subnet. When the designated router of the receiver learns about
the source, it sends a PIM Join message directly to the source’s router, creating a
source-based distribution tree, from the source to the receiver. This source tree does not
include the RP unless the RP is located within the shortest path between the source and
receiver.
5 PIM bidirectional (PIM bidir) (a variant of PIM) allows data flow both up and down the same distribution
tree. PIM bidir uses only shared tree forwarding, thereby reducing the creation of “state” information.
Auto-RP is a mechanism where a PIM router learns the set of group-to-RP
mappings required for PIM SM. Auto-RP automates the distribution of group-to-RP
mappings. To make Auto-RP work, a router must be designated as an RP mapping
agent that receives the RP announcement messages from the RPs and arbitrates
conflicts. Bootstrap Router (BSR) is another mechanism with which a PIM router
learns the set of group-to-RP mappings required for PIM SM. BSR operates similarly
to Auto-RP: it uses candidate routers for the RP function and for relaying the RP
information for a group. RP information is distributed through BSR messages that are
carried within PIM messages. PIM messages are link-local multicast messages that
travel from PIM router to PIM router. Each method for configuring an RP has its
strengths, weaknesses, and complexity. Auto-RP is typically used in a conventional
IP multicast network given that it is straightforward to configure, well tested, and
stable.
IGMP (Versions 1, 2, and 3) is the protocol used by IP Version 4 (IPv4) hosts to
communicate multicast group membership states to multicast routers. IGMP is used to
dynamically register individual hosts/receivers on a particular local subnet (e.g., LAN)
to a multicast group. IGMPv1 defined the basic mechanism. It supports a Membership
Query (MQ) message and a Membership Report (MR) message. Most implementations
at press time employed IGMPv2; Version 2 adds Leave Group (LG) messages. Version
3 adds source awareness allowing the inclusion or exclusion of sources. IGMP allows
group membership lists to be dynamically maintained. The host (user) sends an IGMP
“report,” or join, to the router to be included in the group. Periodically, the router sends
a “query” to learn which hosts (users) are still part of a group. If a host wishes to
continue its group membership, it responds to the query with a “report.” If the host does
not send a “report,” the router prunes the group list to delete this host; this eliminates
unnecessary network transmissions. With IGMPv2, a host may send a “leave group”
message to alert the router that it is no longer participating in a multicast group; this
allows the router to prune the group list to delete this host before the next query is
scheduled, thereby minimizing the time period during which unneeded transmissions
are forwarded to the network.
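The report, query, and leave cycle just described can be sketched as a small router-side membership list. This is a simplified model that follows the description in the text: it tracks hosts individually and closes a query interval in one step, whereas real IGMP implementations rely on timers and message exchanges not shown here. The class and method names are hypothetical.

```python
class GroupMembership:
    """Router-side membership list for one multicast group on one subnet."""

    def __init__(self):
        self.members = set()      # hosts currently in the group
        self.reported = set()     # hosts that answered the current query

    def report(self, host):
        """Host sends a report (a join, or an answer to a query)."""
        self.members.add(host)
        self.reported.add(host)

    def leave(self, host):
        """IGMPv2 Leave Group: remove the host without waiting for a query."""
        self.members.discard(host)
        self.reported.discard(host)

    def query_cycle_end(self):
        """Close a query interval: drop hosts that did not report, then reset."""
        self.members &= self.reported
        self.reported = set()

    def forward(self):
        """The group is forwarded onto the subnet only while members remain."""
        return bool(self.members)

group = GroupMembership()
group.report("h1")
group.report("h2")
group.query_cycle_end()       # both hosts joined during this interval
group.report("h1")            # h1 answers the next query; h2 stays silent
group.query_cycle_end()       # h2 is pruned from the list
print(sorted(group.members), group.forward())
group.leave("h1")             # explicit leave, no need to wait for a query
print(sorted(group.members), group.forward())
```

The explicit leave is what shortens the period of unneeded forwarding: without it, the last member would only be removed at the end of the next query interval.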
6 This is different from the router-to-router PIM Join message just described; this message is from a receiver to
its gateway multicast router.
Some of these protocols (but not all) are covered in the chapters that follow.
Figure 1.2 illustrates where some of these protocols apply in the context of a typical
multicast network.
It should be noted that the design and turnup of IP multicast networks is fairly
complex. This is because by its very nature IP multicast traffic is “blasted all over the
map”; hence, a simple design mistake (or oversight) will push traffic to many interfaces
and easily flood and swamp router and switch interfaces.7
7 This statement is based on some 100-hour weeks spent by the author configuring IPTV networks while
endeavoring to meet established business deadlines.
IPTV AND DVB-H APPLICATIONS 11
[Figure 1.2: Where the protocols apply in a typical multicast network. Labels: RP (PIM rendezvous point);
MPBGP, PIM-SM, Bidir PIM, PIM-SSM, MVPN; IGMP, snooping, RGMP; DR (designated router); IGMP; DSLAM.]
While IP multicast has been around for a number of years, it is now finding fertile
commercial applications in the IPTV and DVB-H arenas. Applications such as datacasting
(e.g., stock market or other financial data) tend to make use of large multihop
networks; pruning is often employed and nodal store-and-forward approaches are totally
acceptable. Applications such as video are very sensitive to end-to-end delay, jitter, and
(uncorrectable) packet loss; QoS considerations are critical. These networks tend to have
fewer hops, and pruning may be somewhat trivially implemented by making use of a
simplified network topology.
IPTV services enable advanced content viewing and navigation by consumers; the
technology is rapidly emerging and becoming commercially available. IPTV services
enable traditional carriers to deliver SD and HD video to their customers in support of
their triple/quadruple play strategies. With the significant erosion in revenues from
TABLE 1.2. Video Dialtone Applications by the Phone Companies According to the FCC First
Video Report, 1994
Date      Telephone Company     Location                         Homes  Type of Proposal
10/21/92  Bell Atlantic-VA      Arlington, VA                    2,000  Technical/market
10/30/92  NYNEX                 New York, NY                     2,500  Technical
11/16/92  New Jersey Bell       Florham Park, NJ                11,700  Permanent
12/15/92  New Jersey Bell       Dover Township, NJ              38,000  Permanent
04/27/93  SNET                  West Hartford, CT                1,600  Technical/market
06/18/93  Rochester Telephone   Rochester, NY                      350  Technical/market
06/22/93  US WEST               Omaha, NE                       60,000  Technical/market
12/15/93  SNET                  Hartford & Stamford, CT        150,000  Technical/market
12/16/93  Bell Atlantic         MD & VA                        300,000  Permanent
12/20/93  Pacific Bell          Orange Co., CA                 210,000  Permanent
                                So. San Francisco Bay, CA      490,000  Permanent
                                Los Angeles, CA                360,000  Permanent
                                San Diego, CA                  250,000  Permanent
01/10/94  US West               Denver, CO                     330,000  Permanent
01/24/94  US West               Portland, OR                   132,000  Permanent
                                Minneapolis/St. Paul, MN       292,000  Permanent
01/31/94  Ameritech             Detroit, MI                    232,000  Permanent
                                Columbus & Cleveland, OH       262,000  Permanent
                                Indianapolis, IN               115,000  Permanent
                                Chicago, IL                    501,000  Permanent
                                Milwaukee, WI                  146,000  Permanent
03/16/94  US West               Boise, ID                       90,000  Permanent
                                Salt Lake City, UT             160,000  Permanent
04/13/94  Puerto Rico Tel. Co.  Puerto Rico                        250  Technical
05/23/94  GTE - Contel of Va.   Manassas, VA                   109,000  Permanent
          GTE Florida Inc.      Pinellas and Pasco Co., FL     476,000  Permanent
          GTE California Inc.   Ventura Co., CA                122,000  Permanent
          GTE Hawaiian Tel.     Honolulu, HI                   334,000  Permanent
06/16/94  Bell Atlantic         Wash. DC LATA                1,200,000  Permanent
                                Baltimore, MD; northern NJ;  2,000,000  Permanent
                                DE; Philadelphia, PA;
                                Pittsburgh, PA; and S.E. VA
06/27/94  BellSouth             Chamblee & DeKalb, GA           12,000  Technical/market
07/08/94  NYNEX                 RI                              63,000  Permanent
                                MA                             334,000  Permanent
09/09/94  Carolina Tel. & Tel.  Wake Forest, NC                  1,000  Technical/market
04/28/95  SNET                  CT                           1,000,000  Permanent
• Content aggregation
• Content encoding [e.g., Advanced Video Coding (AVC)/H.264/MPEG-4 Part 10,
  MPEG-2, SD, HD, Serial Digital Interface (SDI), Asynchronous Serial Interface
  (ASI), layer 1 switching/routing]
• Audio management
T A B L E 1.3. Typical DSL Technologies That May be Used in Current-Day IPTV While Waiting
for FTTH
DSL A technology that exploits unused frequencies on copper telephone lines to
transmit traffic typically at multimegabit speeds. DSL can allow
voice and high-speed data to be sent simultaneously over the same line.
Because the service is “always available,” end users do not need to dial in or
wait for call setup. Variations include ADSL, G.lite ADSL (or simply G.lite),
VDSL [International Telecommunications Union (ITU) G.993.1], and VDSL2
(ITU G.993.2). The standard forms of ADSL [ITU G.992.3 and G.992.5 and
American National Standards Institute (ANSI) T1.413-Issue 2] are all built
upon the same technical foundation, Discrete Multitone (DMT). The suite
of ADSL standards facilitates interoperability between all standard forms
of ADSL.
ADSL (Full-Rate Asymmetric DSL): Access technology that offers differing upload and
download speeds and can be configured to deliver up to 6 Mbps (6000 kbps) from the
network to the customer. ADSL enables voice and high-speed data to be sent
simultaneously over the existing telephone line. This type of DSL is the most
predominant in commercial use for business and residential customers around the
world. Good for general Internet access and for applications where downstream speed
is most important, such as video on demand. ITU-T recommendation G.992.1 and ANSI
standard T1.413-1998 specify full-rate ADSL. ITU recommendation G.992.3 specifies
ADSL2, which provides advanced diagnostics, power saving functions, PSD shaping, and
better performance than G.992.1. ITU recommendation G.992.5 specifies ADSL2Plus,
which provides the benefits of ADSL2 plus twice the bandwidth so that bit rates as
high as 20 Mbps downstream can be achieved on relatively short lines.
G.lite ADSL (or simply G.lite): A standard that was specifically developed to meet
the plug-and-play requirements of the consumer market segment. G.lite is a
medium-bandwidth version of ADSL that allows Internet access at up to 1.5 Mbps
downstream and up to 500 kbps upstream. G.lite is an ITU standard (ITU G.992.2).
G.lite has seen comparatively little use, but it did introduce the valuable concept
of splitterless installation.
RADSL (Rate Adaptive DSL): A nonstandard version of ADSL. Note that standard ADSL
also permits the ADSL modem to adapt speeds of data transfer.
VDSL: A standard for up to 26 Mbps over distances up to 50 m on short loops
such as from fiber to the curb. In most cases, VDSL lines are served
from neighborhood cabinets that link to a central office via optical fiber.
It is useful for “campus” environments—universities and business parks,
for example. VDSL is currently being introduced in market trials to deliver
video services over existing phone lines. VDSL can also be configured in
symmetric mode.
VDSL2 (Second-Generation VDSL): ITU recommendation G.993.2 specifies eight profiles
that address a range of applications including up to 100-Mbps symmetric transmission
on loops about 100 m long (using a bandwidth of 30 MHz), symmetric bit rates in the
10–30-Mbps range on intermediate-length loops (using a bandwidth of 12 MHz), and
asymmetric operation with downstream rates in the range of 10–40 Mbps on loops of
lengths ranging from 3 km to 1 km (using a bandwidth of 8.5 MHz). VDSL2 includes
most of the advanced features from ADSL2. The rate/reach performance of VDSL2 is
better than that of VDSL.
Symmetric flavors DSL: Symmetric variations of DSL that include SDSL, SHDSL, HDSL,
HDSL2, and IDSL. The equal speeds make symmetric DSLs useful for LAN access, video
conferencing, and locations hosting Web sites.
SDSL (Symmetric DSL): A vendor-proprietary version of symmetric DSL that may include
bit rates to and from the customer ranging from 128 kbps to 2.32 Mbps. SDSL is an
umbrella term for a number of supplier-specific implementations over a single copper
pair providing variable rates of symmetric service. SDSL uses 2 Binary, 1 Quaternary
(2B1Q) line coding.
SHDSL: A state-of-the-art, industry-standard symmetric DSL. SHDSL equipment
conforms to ITU recommendation G.991.2, also known as G.shdsl,
approved by the ITU-T in 2001. SHDSL achieves 20% better loop
reach than older versions of symmetric DSL and it causes much
less cross talk into other transmission systems in the same cable.
SHDSL systems may operate at many bit rates, from 192 kbps
to 5.7 Mbps, thereby maximizing the bit rate for each customer.
G.shdsl specifies operation via one pair of wires, or for operation on
longer loops, two pairs of wire may be used. For example, with two
pairs of wire, 1.2 Mbps can be sent over 20,000 ft of American Wire
Gage (AWG) 26 wire. SHDSL is best suited to data-only applications that
need high upstream bit rates. Though SHDSL does not carry voice like
ADSL, new voice-over-DSL techniques may be used to convey digitized
voice and data via SHDSL. SHDSL is being deployed primarily for business
customers.
HDSL (High-Data-Rate DSL): A DSL variety created in the late 1980s that delivers
symmetric service at speeds up to 2.3 Mbps in both directions. Available at 1.5 or
2.3 Mbps, this symmetric fixed-rate application does not provide standard telephone
service over the same line and is already standardized through the ETSI (European
Telecommunications Standards Institute) and ITU. Seen as an economical replacement
for T1 or E1, it uses one, two, or three twisted copper pairs.
HDSL2 (Second-Generation HDSL): A variant of DSL that delivers 1.5-Mbps service each
way, supporting voice, data, and video using either ATM, private-line service, or
frame relay over a single copper pair. This ATIS standard (T1.418) supports a fixed
1.5-Mbps rate both up and downstream. HDSL2 does not provide standard voice
telephone service on the same wire pair. HDSL2 differs from HDSL in that HDSL2 uses
one pair of wires to convey 1.5 Mbps whereas ANSI HDSL uses two wire pairs.
HDSL4: A high-data-rate DSL that is virtually the same as HDSL2 except it achieves
about 30% greater distance than HDSL or HDSL2 by using two pairs of
wire (thus, four conductors), whereas HDSL2 uses one pair of wires.
IDSL (Integrated Services Digital Network DSL): A form of DSL that supports
symmetric data rates of up to 144 kbps using existing phone lines. Has the ability
to deliver services through a DLC (Digital Loop Carrier: a remote device often
placed in newer neighborhoods to simplify the distribution of cable and wiring from
the phone company). While DLCs provide a means of simplifying the delivery of
traditional voice services to newer neighborhoods, they also provide a unique
challenge in delivering DSL into those same neighborhoods. IDSL addresses this
market along with ADSL and G.lite as they are implemented directly into those DLCs.
IDSL differs from its relative ISDN (Integrated Services Digital Network) in that it
is an “always-available” service, but capable of using the same terminal adapter, or
modem, as for ISDN.
Courtesy: DSL Forum.
[Figure: IPTV distribution architecture. At the content source, MPEG-4 encoders feed
encryptors (conditional access system) and IP encapsulators (MPEG-2 transport stream
segmentation and reassembly), which connect through aggregation switches to a
designated router (DR). Content is distributed over an L-band network and, using
PIM DM, through switches at TELCO 1 and TELCO 2 to DSLAMs or optical distribution
nodes, which serve IP receivers via IGMP.]
Audio
. MPEG-1, layer II
. Dolby AC3
. AAC-HE (Advanced Audio Coding—High Efficiency)
The DVB Project (see below) has developed specifications for digital television
systems which are turned into standards by international bodies such as ETSI and
CENELEC (Comité Européen de Normalisation Électrotechnique, the European Committee
for Electrotechnical Standardization). For Digital Rights Management (DRM), DVB
Conditional Access (DVB-CA) defines a Common Scrambling Algorithm (DVB-CSA) and a
Common Interface (DVB-CI) for accessing scrambled content.
This topic will be reexamined in Chapter 11. Next, we briefly discuss DVB-H
applications.
DVB-H is a technical development activity by the DVB Project Office organization
[DVB200701] targeting handheld, battery-powered devices such as mobile telephones,
PDAs, and so on. It addresses the requirements for reliable, high-speed, high-data-rate
reception for a number of mobile applications, including real-time video to handheld
devices. DVB-H systems typically make use of IP multicast. DVB-H is generating
significant interest in the broadcast and telecommunications worlds, and DVB-H
services are expected to start at this time. Industry proponents expect 300 million
DVB-H-capable handsets to be deployed by 2009. The DVB-H protocols are being
standardized through ETSI.
Digital Video Broadcasting (DVB) is a consortium of over 300 companies in the
fields of broadcasting and manufacturing that work cooperatively to establish common
international standards for digital broadcasting. DVB-generated standards have become
the leading international standards, commonly referred to as “DVB,” and the accepted
choice for technologies that enable efficient, cost-effective, higher quality, and
interoperable digital broadcasting. The DVB standards for digital television have been
adopted in the United Kingdom, across mainland Europe, in the Middle East, in South
America, and in Australasia.
transmissions can be even more robust. This is advantageous when considering the hostile
environments and poor (but fashionable) antenna designs typical of handheld receivers.
Like DVB-T, DVB-H can be used in 6-, 7-, and 8-MHz channel environments.
However, a 5-MHz option is also specified for use in nonbroadcast environments. A key
initial requirement, and a significant feature of DVB-H, is that it can coexist with DVB-T
in the same multiplex. Thus, an operator can choose to have two DVB-T services and one
DVB-H service in the same overall DVB-T multiplex.
Broadcasting is an efficient way of reaching many users with a single (configurable)
service. DVB-H combines broadcasting with a set of measures to ensure that the target
receivers can operate from a battery and on the move and is thus an ideal companion to 3G
telecommunications, offering symmetric and asymmetric bidirectional multimedia services.
DVB-H trials have taken place in recent years in Germany, Finland, and the United
States (Las Vegas). Such trials help frequency planning and improve understanding of the
complex issue of interoperability with telecommunications networks and services.
This topic will be reexamined in Chapter 12.
The following are the key RFCs that define multicast operation:
. RFC 988, Host Extensions for IP Multicasting, S. E. Deering, July 1986.
(obsoletes RFC 966) (obsoleted by RFC 1054, RFC 1112).
. RFC 1054, Host Extensions for IP Multicasting, S. E. Deering, May 1988
(obsoletes RFC 0988) (obsoleted by RFC 1112).
. RFC 1075, Distance Vector Multicast Routing Protocol, D. Waitzman, C.
Partridge, S. Deering, November 1988.
. RFC 1112, Host Extensions for IP Multicasting, S. E. Deering, August 1989
(obsoletes RFC 0988, RFC 1054) (Updated by RFC 2236) (also STD0005) (status:
standard).
. RFC 1469, IP Multicast over Token-Ring Local Area Networks, T. Pusateri, June
1993 (status: historic).
. RFC 1584, Multicast Extensions to OSPF, J. Moy, March 1994.
The following is a (partial) listing of textbooks on the topic. Most were written several
years ago.
REFERENCES
[MIN200001] D. Minoli, Digital Video Technologies, video section in K. Terplan and P. Morreale,
Editors. The Telecommunications Handbook, IEEE Press, 2000.
[MIN200301] D. Minoli, Telecommunications Technology Handbook, 2nd ed., Artech House,
Norwood, MA, 2003.
[MIN200401] D. Minoli, A Networking Approach to Grid Computing, Wiley, New York, 2006.
[RFC988] RFC 988, Host Extensions for IP Multicasting, S. E. Deering, July 1986.
[RFC1054] RFC 1054, Host Extensions for IP Multicasting, S. E. Deering, May 1988.
[RFC1112] RFC 1112, Host Extensions for IP Multicasting, S. E. Deering, August 1989.
[RFC2201] RFC 2201, Core Based Trees (CBT) Multicast Routing Architecture, A. Ballardie,
September 1997.
[RFC3973] RFC 3973, Protocol Independent Multicast–Dense-Mode (PIM–DM): Protocol
Specification (Revised), A. Adams, J. Nicholas, W. Siadak, January 2005.
[WEL200101] P. J. Welcher, The Protocols of IP Multicast, White Paper, Chesapeake
NetCraftsmen, Arnold, MD.
2 MULTICAST ADDRESSING FOR PAYLOAD
Multicast communication is predicated on the need to send the same content to multiple
destinations simultaneously. The group or groups of recipients are generally dynamic in
nature and the join and dejoin (leave) rate may be high; furthermore, a given
join/dejoin action is expected to take effect within 1–2 s (consider the typical
example of viewers changing TV channels on their remote control device).
Underpinning this ability to sustain multipoint communication is the addressing
scheme. There is a desire in multicast environments to have distributed control of the user
groups. The implication is that the source should not have to know and specifically
address each intended recipient individually, thereby having to maintain large central
tables of current users. Therefore, it follows that a mechanism needs to be available to
accomplish this distribution in an efficient manner. This is accomplished via the use of
multicast IP addresses and the multicast Media Access Control (MAC) addresses. This
topic is discussed in this chapter.
Multicast addresses define, in effect, the group of hosts that participate in the shared
reception of the content intended for that group. One can think of this by analogy with a
local TV station or local radio station. When a user “tunes” the TV to, say, Channel 7
(WABC TV in New York City), the user joins the set of viewers (receivers) that receive
the content produced and distributed by WABC TV. When a user then changes channel
and “tunes” the TV to, say, Channel 4 (WNBC TV in New York City), the user joins the
set of viewers (receivers) that receive the content produced and distributed by WNBC TV.
In IP multicast the analogous activity is accomplished by using IP multicast addresses.
The various content providers stream IP packets that have their own source address and a
multicast address as the destination address. For example, WABC TV in New York City
could generate programming with the address 239.10.10.1, WNBC TV in New York City
could generate programming with the address 239.10.10.2, and so on.
RFC 1112 specifies the extensions required of a host implementation of IP to support
multicasting. The IANA controls the assignment of IP multicast addresses. IANA has
allocated what has been known as the Class D address space to be utilized for IP
multicast. IP multicast group addresses are in the range 224.0.0.0–239.255.255.255. See
Figure 2.1. For each multicast address, there exists a set of zero or more hosts (receivers)
that look for packets transmitted to that address. This set of devices is called a host group.
A source (host) that sends packets to a specific group does not need to be a member of the
group and the host typically does not even know the current members in the group
[PAR200601]. As noted above, the source address for multicast IP packets is always the
unicast source address.
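The Class D range can be checked with a few lines of code. The following is a minimal sketch rather than part of any host implementation; the function name `is_class_d` is an assumption made for illustration. It tests the four high-order bits (1110) of the 32-bit address, which is equivalent to the range 224.0.0.0–239.255.255.255.

```python
import ipaddress


def is_class_d(addr: str) -> bool:
    """Return True if addr lies in 224.0.0.0-239.255.255.255 (Class D)."""
    value = int(ipaddress.IPv4Address(addr))
    # Class D addresses carry the bit pattern 1110 in the four high-order bits.
    return (value >> 28) == 0b1110


# Illustrative group addresses taken from the TV-channel example in the text.
for candidate in ("239.10.10.1", "239.10.10.2", "192.1.1.1"):
    print(candidate, is_class_d(candidate))
```

The standard library exposes the same test as `ipaddress.IPv4Address(addr).is_multicast`; the explicit bit test is shown only to make the 1110 prefix visible.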
There are two types of host groups [PAR200601]:
. Permanent host groups: Applications that are part of this type of group have an IP
address permanently assigned by the IANA. A permanent group continues to exist
even if it has no members. Membership in this type of host group is not permanent:
a host (receiver) can join or leave the group as desired. An application can use DNS
to obtain the IP address assigned to a permanent host group using the domain
mcast.net. The application can determine the permanent group from an address by
using a pointer query in the domain 224.in-addr.arpa.
. Transient host groups: Any group that is not permanent as just described is
by definition transient. The group is available for dynamic assignment as needed.
Transient groups cease to exist when the number of members drops to zero.
[Figure 2.1: Class D IP addresses consist of the bits 1110 followed by the group ID.
Administrative categories: (1) “well-known” multicast addresses (assigned by IANA);
(2) “transient” multicast addresses, assigned and reclaimed dynamically by
organization.]
As described above, some IP multicast addresses have been reserved for specific
functions. Addresses in the range 224.0.0.0–224.0.0.255 are reserved to be used by
network protocols on a local network segment. Network protocols make use of these
addresses for automatic router discovery and to communicate routing information (e.g.,
OSPF uses 224.0.0.5 and 224.0.0.6 to exchange link-state information). IP packets with
these addresses are not forwarded by a router; they remain local on a particular LAN
segment [they have a Time-to-Live (TTL) parameter set to 1; even if the TTL is different
from 1, they still are not forwarded by the router]. These addresses are also known as link-
local addresses.
The statically assigned link-local scope is 224.0.0.0/24. The list of IP addresses
assigned to permanent host groups is included in RFC 3232. From November 1977
through October 1994, the IANA periodically published tables of the IP parameter
assignments in RFCs entitled, “Assigned Numbers.” The most current of these assigned
number RFCs had standard status and carried the designation: STD 2, RFC 1700. At
this time, RFC 1700 has been obsoleted by RFC 3232. Since 1994, this sequence of
RFCs has been replaced by an online database accessible through the IANA Web page
(www.iana.org).
Some well known link-local addresses are the following:
Some of these globally scoped addresses have been assigned recently, for example:
RFC 2365 defines two limited scopes of interest: the IPv4 Local Scope and IPv4
Organization Local Scope.
The IPv4 Local Scope (239.255.0.0/16): 239.255.0.0/16 is defined to be the IPv4
Local Scope. The local scope is the minimal enclosing scope and hence is not further
divisible. Although the exact extent of a local scope is site dependent, locally scoped
regions must obey certain topological constraints. In particular, a local scope must not
span any other scope boundary. Further, a local scope must be completely contained
within or equal to any larger scope. In the event that scope regions overlap in area, the area
of overlap must be in its own local scope. This implies that any scope boundary is also a
boundary for the local scope. The IPv4 Local Scope space grows “downward.” As such,
the IPv4 Local Scope may grow downward from 239.255.0.0/16 into the reserved ranges
239.254.0.0/16 and 239.253.0.0/16. However, these ranges should not be utilized until
the 239.255.0.0/16 space is no longer sufficient.
The IPv4 Organization Local Scope (239.192.0.0/14): 239.192.0.0/14 is defined to
be the IPv4 Organization Local Scope and is the space from which an organization
should allocate subranges when defining scopes for private use. The ranges
239.0.0.0/10, 239.64.0.0/10, and 239.128.0.0/10 are unassigned and available for expansion of
this space. These ranges should be left unassigned until the 239.192.0.0/14 space is no
longer sufficient. This is to allow for the possibility that future revisions may define
additional scopes on a scale larger than organizations.
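The ranges discussed in this section can be collected into a small lookup. The sketch below is illustrative only: the table `SCOPES`, the function `classify_scope`, and the label strings are assumptions, and only the three ranges named in the text (224.0.0.0/24, 239.255.0.0/16, and 239.192.0.0/14) are included.

```python
import ipaddress

# Ranges taken from the text; the labels are illustrative.
SCOPES = [
    ("link-local", ipaddress.ip_network("224.0.0.0/24")),
    ("IPv4 Local Scope", ipaddress.ip_network("239.255.0.0/16")),
    ("IPv4 Organization Local Scope", ipaddress.ip_network("239.192.0.0/14")),
]


def classify_scope(addr: str) -> str:
    """Name the scope of a multicast address, if it is one of those above."""
    ip = ipaddress.IPv4Address(addr)
    if not ip.is_multicast:
        return "not multicast"
    for name, network in SCOPES:
        if ip in network:
            return name
    return "other multicast"


for candidate in ("224.0.0.5", "239.255.1.1", "239.193.7.7", "239.10.10.1"):
    print(candidate, "->", classify_scope(candidate))
```

A /14 prefix covers 239.192.0.0 through 239.195.255.255, so an address such as 239.196.0.0 falls outside the Organization Local Scope in this sketch.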
. frames destined for that card as defined by the destination MAC address and the
burned-in MAC address on the NIC or
. the broadcast MAC address (the broadcast address is 0xFFFF.FFFF.FFFF).
It follows that some mechanism has to be defined so that multiple hosts could receive the
same packet and still be able to differentiate between multicast groups. In the IEEE 802.3
standard, bit 0 of the first octet is used to indicate a broadcast and/or multicast frame, as
shown in Figure 2.3. This broadcast/multicast bit denotes that the frame is intended for an
arbitrary group of hosts or all hosts on the network. IP multicast utilizes this capability to
transmit IP packets to a group of hosts on a LAN segment.
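[Figure: the broadcast/multicast bit in the first octet of the MAC address.]
[Figure: mapping of a 32-bit IP multicast address to a 48-bit MAC address. The 1110
prefix is followed by a 28-bit multicast address, of which 5 bits are dropped and the
low-order 23 bits are placed after a 25-bit prefix.]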
[Figure: frame format showing Datalink Header, Logical Link Control, SNAP, and Data
and CRC (FCS).]
In the video arena there is an additional “address” (channel indicator) that is often used.
This is the Packet Identifier (PID; some also call this the Program ID). The topic is
introduced briefly here and revisited in Chapter 11. Operators typically maintain tables
that show both the multicast IP address and the PID as a way to manage the content
delivery. The video portion of the IPTV
system is defined by key documents such as (but not limited to):
The MPEG-2 and/or MPEG-4 standard defines three layers: audio, video, and
systems. The systems layer supports synchronization and interleaving of multiple
compressed streams, buffer initialization and management, and time identification. The
audio and the video layers define the syntax and semantics of the corresponding
Elementary Streams (ESs). An ES is the output of an MPEG encoder and typically
contains compressed digital video, compressed digital audio, digital data, and digital
control data. The information corresponds to an access unit (a fundamental unit of
encoding), such as a video frame. Each ES is in turn an input to an MPEG-2 processor that
accumulates the data into a stream of Packetized Elementary Stream (PES) packets. A
PES typically contains an integral number of ESs. See Figure 2.7, which shows both the
multiplex structure and the Protocol Data Unit (PDU) format [MIN200801].
As seen in the figure, PESs are then mapped to Transport Stream (TS) unit(s). Each
MPEG-2 TS packet carries 184 octets of payload data prefixed by a four-octet (32-bit)
header (the resulting 188-byte packet size was originally chosen for compatibility with
ATM systems). These packets are the basic unit of data in a TS. They consist of a sync byte
(0x47), followed by flags and a 13-bit PID. This is followed by other (some optional)
transport fields; the rest of the packet consists of the payload. DVB specifications for
transmission add 16 bytes of Reed–Solomon forward error correction to create a packet
that is 204 bytes long. See Figure 2.8.
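The header layout just described can be illustrated with a short parser. This is a minimal sketch, not a complete demultiplexer: the bit positions used for the flags and the continuity counter follow the MPEG-2 systems layer header as commonly documented and are not spelled out in the text above, and `make_ts_packet` is a hypothetical helper that exists only to build test input.

```python
TS_PACKET_SIZE = 188   # 4-octet header + 184 octets of payload
SYNC_BYTE = 0x47


def parse_ts_header(packet: bytes) -> dict:
    """Extract the PID and a few flags from one 188-byte TS packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a TS packet must be exactly 188 bytes")
    if packet[0] != SYNC_BYTE:
        raise ValueError("packet does not start with the 0x47 sync byte")
    return {
        "payload_unit_start": bool(packet[1] & 0x40),
        # 13-bit PID: low 5 bits of octet 1 followed by all 8 bits of octet 2.
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        "continuity_counter": packet[3] & 0x0F,
        "payload": packet[4:],
    }


def make_ts_packet(pid: int, payload: bytes = b"", counter: int = 0) -> bytes:
    """Hypothetical helper: build a payload-only TS packet for testing."""
    header = bytes([
        SYNC_BYTE,
        (pid >> 8) & 0x1F,
        pid & 0xFF,
        0x10 | (counter & 0x0F),   # 0x10 marks "payload only, no adaptation field"
    ])
    body = payload[:184].ljust(184, b"\xff")   # pad to 184 octets
    return header + body


header = parse_ts_header(make_ts_packet(121, b"video data"))
print(header["pid"], len(header["payload"]))
```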
As noted, the PID is a 13-bit field that is used to uniquely identify the stream to which
the packet belongs (e.g., PES packets corresponding to an ES) generated by the
multiplexer. The PID allows the receiver to differentiate the stream to which each
received packet belongs; effectively, it allows the receiver to accept or reject PES packets
at a high level without burdening the receiver with extensive processing. Often one sends
only one PES (or a part of a single PES) in a TS packet (in some cases, however, a given
PES packet may span several TS packets so that the majority of TS packets contain
continuation data in their payloads). MPEG TS are typically encapsulated in the User
Datagram Protocol (UDP) and then in IP.
[Figure 2.7: Multiplex structure and PDU format. An encoder takes Serial Digital
Interface (SD) video and produces video, audio, and data PESs (stream IDs 1, 2, and so
on), which are multiplexed into a Single Program Transport Stream (SPTS) and passed to
UDP/IP encapsulation at the encoder output. Each MPEG-2 TS packet has a 4-byte (32-bit)
header and a payload of at most 184 bytes; the PES header is shown with fields of 24,
8, and 16 bits.]
Note: traditional approaches make use of the PID to identify content; in IPTV
applications, the IP multicast address is used to identify the content. Also, the latest IPTV
systems make use of MPEG-4-coded PESs.
Ultimately, an IPTV stream consists of packets of fixed size, each of which carries a
stream-identifying number called a PID. These MPEG packets are aggregated into an IP
packet; the IP packet is transmitted using IP multicast methods. Each PID contains specific
video, audio, or data information. To display a channel of IPTV digital television, the
DVB-based application configures the driver in the receiver to pass up to it the packets with
a set of specific PIDs, for example, PID 121 containing video and PID 131 containing audio
(these packets are then sent to the MPEG decoder, either hardware or software based). So,
in conclusion, a receiver or demultiplexer extracts elementary streams from the TS in part
by looking for packets identified by the same PID.
[Figure 2.8: DVB packet of 204 bytes, consisting of a 188-byte MPEG-2 packet followed
by 16 bytes of Reed–Solomon error correction.]
Programs are groups of one or more PID streams that are related to each other. For
example, a TS used in IPTV could contain five programs to represent five video channels.
Assume that each channel consists of one video stream, one or two audio streams, and
metadata. A receiver wishing to tune to a particular “channel” has to decode the payload
of the PIDs associated with its program. It can discard the contents of all other PIDs.
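Tuning to a program by PID can be sketched as a simple filter. The example below is illustrative only: the program table `PROGRAMS`, its channel names, the PID values for the second channel, and the helper names are assumptions; PIDs 121 (video) and 131 (audio) are the example values used in the text.

```python
def pid_of(packet: bytes) -> int:
    """Return the 13-bit PID from a TS packet header (octets 1 and 2)."""
    return ((packet[1] & 0x1F) << 8) | packet[2]


def select_program(packets, wanted_pids):
    """Keep only the packets whose PID belongs to the chosen program."""
    wanted = set(wanted_pids)
    return [p for p in packets if pid_of(p) in wanted]


def fake_packet(pid: int) -> bytes:
    """Hypothetical helper: a 188-byte packet carrying only a PID."""
    return bytes([0x47, (pid >> 8) & 0x1F, pid & 0xFF, 0x10]) + b"\xff" * 184


# Hypothetical program table: each "channel" is a set of related PIDs.
PROGRAMS = {
    "channel-1": {121, 131},   # video PID 121, audio PID 131 (from the text)
    "channel-2": {221, 231},   # assumed values for a second channel
}

stream = [fake_packet(pid) for pid in (121, 221, 131, 231, 121)]
tuned = select_program(stream, PROGRAMS["channel-1"])
print([pid_of(p) for p in tuned])
```

Packets for every other PID are simply discarded, which is the behavior described above for a receiver tuned to one "channel."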
DVB embodies the concept of “virtual channels” in a fashion analogous to ATM.
Virtual channels are identified by PIDs and are also colloquially known as “PIDs.”
DVB packets are transmitted over the satellite network (one can think of the DVB
packets as being similar to an ATM cell, but with different length and format). The
receiver looks for the specific PIDs that it has been configured to acquire and then
injects them into the telco network.
For satellite transmission, and to remain consistent with already existing MPEG-2
technology, TSs are further encapsulated in Multiprotocol Encapsulation (MPE; RFC
3016) and then segmented again and placed into TSs via a device called IP Encapsulator
(IPE); see Figure 2.9. MPE is used to transmit datagrams that exceed the length of the
DVB “cell,” just like ATM Adaptation Layer 5 (AAL5) is used for a similar function in an
ATM context. MPE allows one to encapsulate IP packets into MPEG-2 TSs (“packets,” or
“cells”). See Figure 2.10.
Note: existing receivers [specifically, Integrated Receiver Decoders (IRDs)] are based
on hardware that works by de-enveloping MPEG-2 TSs; hence, the MPEG-4-encoded PESs are
mapped to TSs at the source.
[Figures 2.9 and 2.10: Encapsulation of encoder output. The encoder output is a series
of MPEG-2 TS packets (4-byte header and 184-byte payload carrying video, audio, and
data); seven TS packets with the same PID are carried in one IP multicast packet over
the terrestrial network. Header sizes shown: Ethernet 14 bytes, IP 20 bytes, UDP
8 bytes, MPE 17 bytes, MPEG-2 transport 4 bytes. The MPE packet consists of a 13-byte
MPE header, a 28-byte UDP/IP header (or a 40-byte TCP/IP header), a 139-byte payload
(content of the “virtual channel,” aka PID), and a 4-byte CRC.]
MULTICAST PAYLOAD FORWARDING
The transmission process at the source builds a datagram stream with a specified
destination IP multicast address, as defined in Chapter 2. The source network driver
encapsulates the datagram with an Ethernet frame that includes the source Ethernet
address and a(n appropriate) destination Ethernet address. The process then sends the
packet to the destination. In unicast IP traffic forwarding, the encapsulation of the IP
packet into an Ethernet frame with the MAC address of the receiving device
makes use of the Address Resolution Protocol (ARP). For multicast, as discussed in the
previous chapter, a static mapping has been defined to “create” the destination MAC
address (a multicast Ethernet address). As noted in Chapter 2, in an Ethernet network,
multicasting is supported by setting the high-order octet of the data link address to 0x01.
The 32-bit multicast IP address is mapped to an Ethernet address by placing the
low-order 23 bits of the Class D address into the low-order 23 bits of the IANA-reserved
address block. Because the high-order 5 bits of the IP multicast group are not utilized,
32 different IP-level multicast groups are mapped to the same Ethernet address.
Figure 3.1 illustrates one example. The destination process informs its network device
drivers that it wishes to receive datagrams destined for a given IP multicast address. The
device driver enables reception of packets for that IP multicast address.
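The mapping just described can be written out in a few lines. This is a minimal sketch under the assumption that the IANA-reserved block begins with 01-00-5E, the prefix that appears in the example of Figure 3.1; the function name `multicast_mac` is illustrative.

```python
import ipaddress

# Assumed base of the IANA-reserved block (prefix as seen in Figure 3.1).
MAC_BASE = 0x01005E000000


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group address to its Ethernet multicast address."""
    ip = ipaddress.IPv4Address(group)
    if not ip.is_multicast:
        raise ValueError(f"{group} is not a Class D address")
    low23 = int(ip) & 0x7FFFFF          # keep only the low-order 23 bits
    mac = MAC_BASE | low23
    return "-".join(f"{octet:02X}" for octet in mac.to_bytes(6, "big"))


print(multicast_mac("224.11.8.6"))    # 01-00-5E-0B-08-06, as in Figure 3.1
print(multicast_mac("239.139.8.6"))   # same MAC: the dropped bits differ
```

The second call shows the nonuniqueness noted in the text: 224.11.8.6 and 239.139.8.6 differ only in the bits that are dropped, so both groups map to the same Ethernet address.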
Because of the nonunique IP-to-Ethernet address mapping, filtering by the device
driver is required. This is accomplished by checking the destination address in the IP
header before passing the packet to the IP layer. This ensures the receiving process does
not receive unneeded datagrams. Hosts that are not participating in a host group are not
listening for the multicast address; here, multicast packets are filtered by lower layer
network interface hardware [PAR200601].
Multicast routing protocols have been developed to transmit packets across a routed
network while at the same time seeking to avoid routing loops. There are two functions
required to support multicast transmission across a routed network:
[Figure 3.1: Example of multicast address mapping. Source host A (actual MAC address
A0-0B-07-C1-32-11, actual source IP address 192.1.1.1) sends a multicast datagram to
the multicast address 224.11.8.6 (E0-0B-08-06 in hexadecimal). The frame carries SA
A0-0B-07-C1-32-11, multicast DA 01-00-5E-0B-08-06, source IP 192.1.1.1, and multicast
IP 224.11.8.6. A second host has actual MAC address C2-1B-32-11-07-07 and actual IP
address 192.1.1.2.]
This mechanism allows a host to locate the nearest user/host seeking a specific
multicast address. The source host sends out a datagram with a TTL value of 1 (same
subnet) and waits for a reply. If no reply is received, the source host resends the datagram
with a TTL value of 2. If no reply is received, the host continues to increment the TTL
value until the nearest user is found.
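The incrementing-TTL search can be sketched without any network code. In the example below, `probe` is a hypothetical callback standing in for "send the datagram with this TTL and report whether a reply arrived," and the upper bound `max_ttl` is an assumption added so that the loop terminates when no user exists.

```python
from typing import Callable, Optional


def expanding_ring_search(probe: Callable[[int], bool],
                          max_ttl: int = 32) -> Optional[int]:
    """Return the smallest TTL at which a reply is received, or None."""
    for ttl in range(1, max_ttl + 1):   # start on the same subnet (TTL 1)
        if probe(ttl):
            return ttl
    return None                          # no user found within max_ttl hops


# Simulated network: the nearest user is assumed to be 3 hops away.
nearest_hops = 3
print(expanding_ring_search(lambda ttl: ttl >= nearest_hops))
```

In a real host the callback would set the TTL on an outgoing multicast datagram and wait for a reply with a timeout; those details are omitted here.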
Multicast-capable routers create logical distribution trees that control the path that IP
multicast traffic takes through the network in order to deliver traffic to all receivers.
Mechanisms exist for creating and for maintaining (e.g., pruning) the distribution trees.
Different multicast algorithms (e.g., PIM, CBT, DVMRP) use different techniques for
establishing the distribution tree. One can classify algorithms into source-based tree
algorithms and shared-tree algorithms. Different algorithms have different scaling
characteristics, and the characteristics of the resulting trees differ too, for example,
from an end-to-end delay perspective.
Members of multicast groups can join or leave (dejoin) at any point in time;
therefore, the distribution trees must be dynamically updated. When all the active
receivers on a particular branch stop requesting the traffic for a particular multicast
group, the multicast-capable routers prune that branch from the distribution tree and stop
forwarding traffic along that branch. If a receiver on that branch becomes active and
requests the multicast traffic, the multicast-capable router will dynamically modify the
distribution tree and start forwarding traffic again [CIS200701]. Note that IP multicast
does not require the senders to a group to be members of that group.
The two basic types of multicast distribution trees are source trees [also known as
Shortest Path Trees (SPTs) or source-based trees] and shared trees (also known as share-
based trees). Messages are replicated only where the tree branches. Both SPTs and shared
trees are loop-free topologies.
The simplest form of a multicast distribution tree is the source tree. A source tree has
its root at the multicast source and has branches forming a spanning tree over the network
to the receivers. The tree makes use of the shortest path through the network and so a
separate SPT may exist for each individual source sending to each group. The notation of
(S,G) is utilized to describe an SPT where S is the IP address of the source and G is the
multicast group address. Figure 3.2 depicts an example of an SPT for group 239.1.1.1
rooted at the source, host A, and connecting three receivers, hosts B, C, and D. Using the
(S,G) notation, the SPT is (92.1.1.1, 239.1.1.1).
SPTs achieve, by definition, the optimal path topology between the source and
the receivers in terms of the number of hops, resulting in the minimum amount of
network latency for distributing multicast traffic. However, the multicast-capable
routers are required to maintain path information for each source. In large networks
with many sources and/or many groups, this state information can overtax
the routers, particularly for memory resources needed to store the multicast routing
table. Note that Deering's multicast algorithms [RFC988, RFC1054, RFC1112]
build source-rooted delivery trees, with one delivery tree per sender subnetwork
[RFC2201].
[Figure 3.2: SPT for multicast traffic to group 239.1.1.1, rooted at source host A
(92.1.1.1) and reaching receiver hosts B (92.2.2.2), C (92.3.3.3), and D (92.2.2.4).]
Figure 3.3 depicts a simple IPTV example; in this example the SPT is employed to
efficiently distribute video to remote users. Figure 3.4 shows the real-time pruning that
takes place.
Shared trees use a single common root placed at a selected point in the network.
This shared root is called a Rendezvous Point (RP) (also called core or center). Figure
3.5 shows a shared tree for the group 239.1.1.1 along with the shared root. When
making use of a shared tree, sources send their traffic to the root (RP) and then the traffic
is forwarded along the shared tree to reach all active receivers. In this example,
multicast traffic from sources 1 and 2 travels to the router at the shared root and then
along the shared tree to the three receivers, hosts B, C, and D. All sources in the
multicast group use the common shared tree. The notation (*, G) is used to represent the
tree. In this case, “*” is a wildcard that means all sources. The shared tree shown in
Figure 3.5 is written as (*, 239.1.1.1).
Shared trees require the minimum amount of state information in each router,
thereby minimizing the memory requirements for the routers and the mechanisms to
keep the state information up to date. However, the paths between the source and
receivers may not be the optimal paths in terms of hops and, consequently, latency. This
[Figure residue: video source/server A delivering stream A to receivers through a unicast router versus a multicast router.]
[Figure residue: video source/server, multicast routing protocol, group membership protocol (Join), receivers.]
institutional network, carrier network, or Internet; with these applications there may be
“pockets” of denseness, but at the global level, wide-area groups tend to be sparsely
distributed [RFC2201].
A shared-tree architecture offers an improvement in scalability over source tree
architectures by a factor of the number of active sources. Source trees scale O(S · G),
since a distinct delivery tree is built per active source. Shared trees eliminate the source
S scaling factor; all sources use the same shared tree, therefore, a shared tree scales
O(G). The implication of this is that applications with many active senders, such as
distributed interactive simulation applications and distributed video gaming (where
most receivers are also senders), have significantly less impact on underlying multicast
routing if shared trees are used [RFC2201]. Notice that in IPTV the source is typically
unique because the source (call it a supersource) itself aggregates many content
channels (from various providers, e.g., 200 providers) into a single IP multicast stream
in order to apply a consistent coding scheme (e.g., convert from MPEG-2 to MPEG-4,
convert from analog to digital MPEG-4, convert from component video to MPEG-4
video) and to apply a consistent conditional access (digital rights management)
discipline.
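The O(S · G) versus O(G) comparison above can be made concrete with a small calculation. The following is a minimal sketch, assuming every group has the same number of active sources; the function names and the sample figures are illustrative and do not come from the text.

```python
def source_tree_entries(sources_per_group: int, groups: int) -> int:
    """Number of (S,G) entries: one delivery tree per active source per group."""
    return sources_per_group * groups


def shared_tree_entries(groups: int) -> int:
    """Number of (*,G) entries: one shared tree per group, whatever the source count."""
    return groups


# Illustrative numbers only: 200 groups with 50 active senders each.
groups, senders = 200, 50
print(source_tree_entries(senders, groups))  # 10000
print(shared_tree_entries(groups))           # 200
```

With a single supersource per group, as in the IPTV case described above, the two counts coincide, which is why the scaling advantage matters mostly for many-sender applications.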
For general applications (such as but not limited to datacasting and grid comput-
ing), shared trees incur significant bandwidth and state savings compared with source
trees. The first reason for this is that the tree only spans a group's receivers (including
46 MULTICAST PAYLOAD FORWARDING
The basic approach in unicast routing is forwarding information toward the receiver. In
multicast routing, the source needs to send traffic to a dynamic group of hosts, as
represented by a multicast group address. The basic approach in multicast routing is
forwarding multicast traffic away from the source; this approach is called Reverse Path
Forwarding (RPF).
In traditional unicast routing, traffic is relayed through the network along a single
path from the source to the destination. The router uses its routing table to determine how
to forward the traffic toward that destination, specifically to determine the outgoing
router link that a datagram should utilize. The table is indexed on destination
addresses since a unicast router is generally only concerned with the destination
address on a datagram that needs forwarding.1 After the interface is selected, the router
forwards a single copy of the unicast datagram out that interface in the direction of the
destination.
RPF enables multicast-capable routers to correctly forward multicast traffic along
the distribution tree and avoid loops. A multicast-capable router must keep track of which
direction is toward the source (upstream) and which direction(s) is (are) toward the
receiver (downstream). A multicast-capable router forwards a multicast packet only if it is
received on the upstream interface. When there are multiple downstream paths, the router
replicates the packet and forwards it along the appropriate downstream paths.
RPF employs the existing unicast routing table to determine the upstream and
downstream neighbors. When a multicast packet arrives at a router, the router will
perform an RPF check on the packet, namely, determine if it has been received on the
upstream interface. If the RPF check is successful, the packet is forwarded; otherwise the
packet is dropped.
For traffic flowing along a source tree, the RPF check procedure works as follows
[CIS200701]:
Step 1. Router looks up the source address in the unicast routing table to determine if
it has arrived on the interface that is on the reverse path back to the source.
Step 2. If the packet has arrived on the interface leading back to the source, the RPF
check is successful and the packet will be forwarded.
Step 3. If the RPF check in Step 2 fails, the packet is dropped.
Figure 3.7 shows examples of RPF checks. If a multicast packet from source
152.11.5.6 is received on interface Serial 0 (S0), a check of the multicast route table
shows that this packet should be dropped. If, on the other hand, the multicast packet arrives
on S1, then the RPF check passes and the packet is forwarded according to the interface
defined in the unicast routing table (S2, for example).
Note: the RPF algorithm builds a tree for each source in a multicast group.
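The RPF check steps above can be sketched in a few lines. This is a minimal illustration, not a router implementation: the routing table contents, the interface names other than S0 and S1, and the function names are assumptions made for the example, chosen so that source 152.11.5.6 is reached through S1 as in the discussion of Figure 3.7.

```python
import ipaddress

# Hypothetical unicast routing table: (destination prefix, interface toward it).
UNICAST_ROUTES = [
    (ipaddress.ip_network("152.11.0.0/16"), "S1"),
    (ipaddress.ip_network("198.14.32.0/24"), "S0"),
    (ipaddress.ip_network("0.0.0.0/0"), "E0"),
]


def rpf_interface(source: str):
    """Return the interface on the reverse path to `source` (longest-prefix match)."""
    addr = ipaddress.ip_address(source)
    matches = [(net, ifc) for net, ifc in UNICAST_ROUTES if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def rpf_check(source: str, arrival_interface: str) -> bool:
    """Step 1 and Step 2: pass only if the packet arrived on the reverse-path interface."""
    return rpf_interface(source) == arrival_interface


for ifc in ("S0", "S1"):
    verdict = "forward" if rpf_check("152.11.5.6", ifc) else "drop"  # Step 3: drop on failure
    print(ifc, verdict)
```

The sketch reuses the unicast table for the lookup, which mirrors the point made above that RPF relies on existing unicast routing information rather than a separate topology database.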
1
A router is generally not concerned about the source address except in special cases.
The CBT algorithm defines another method to determine (near) optimum paths to
support multicast groups. (The disadvantage of the CBT algorithm is that it could in
theory build a suboptimal path for some sources and receivers.) This algorithm builds a
delivery tree for each multicast group, but the tree is identical for all sources. Each router
maintains a single tree for the entire group. The algorithm operates by making use of the
following steps [PAR200601, RFC2189, RFC2201]:
or layer 3 switches between senders and receivers must be IP multicast enabled; at the
very least, the ingress and egress routers to the backbone should be multicast routers. If
the intervening backbone routers lack support for IP multicast, IP tunneling (encapsu-
lating multicast packets within unicast packets) may be used as an interim measure to link
multicast routers. The choice of multicast routing protocol among PIM, DVMRP,
MOSPF, and CBT should be based on the characteristics of the multicast application
being deployed as well as the “density” and geographical location of receiving hosts
[CIS200701].
REFERENCES
[CIS200701] Cisco Systems, Internet Protocol (IP) Multicast Technology Overview, White Paper,
Cisco Systems, San Jose, CA.
[PAR200601] L. Parziale, W. Liu, et al., TCP/IP Tutorial and Technical Overview, IBM Press,
Redbook Abstract, IBM Form Number GG24-3376-07, 2006.
[RFC2189] RFC 2189, Core Based Trees (CBT Version 2) Multicast Routing—Protocol
Specification, A. Ballardie, September 1997.
[RFC2201] RFC 2201, Core Based Trees (CBT) Multicast Routing Architecture, A. Ballardie,
September 1997.
4
DYNAMIC HOST
REGISTRATION—INTERNET
GROUP MANAGEMENT
PROTOCOL
This chapter covers the important topic of dynamic host registration. In a multicast
environment, group membership information is exchanged between a receiver (host) and
the nearest multicast router. IGMP is used by host receivers to join or leave a
multicast host group. Hosts establish group memberships by sending IGMP messages to
their local multicast router. Multicast-enabled routers monitor for IGMP messages to
maintain forwarding tables for the various interfaces on the router. Multicast-enabled
routers periodically send out queries to discover which groups are active or inactive on a
given subnet.
IGMP is used by IPv4-based receivers to report their IP multicast group memberships
to neighboring multicast routers: it defines the signaling communication occurring
between receiving hosts and their local multicast router. IGMPv2, now widely deployed, is
defined by RFC 2236 (November 1997).
The latest version at press time was Version 3, defined in RFC 3376 (IGMPv3,
October 2002); RFC 3376 subsumes and obsoletes RFC 2236 (IGMPv2). IGMPv3
supports receivers that explicitly signal sources from which they wish to receive traffic.
Specifically, IGMPv3 is employed by hosts to signal channel access in SSM. For SSM to
work, IGMPv3 must be available in last-hop routers and receiver host operating system
network stacks and be used by the applications running on those receiver hosts. Benefits
of SSM include:
In Version 1 (defined in RFC 1112), there are two types of IGMP messages: MQ and MR.
In a Version 1 implementation, receivers interested in joining a particular multicast group
generate MRs, which contain reference to that multicast address (group). The router, in
turn, builds a forwarding table entry and routinely forwards the multicast packets to the
interface(s) that support the subnet where registered hosts reside. The router periodically
sends out an IGMP MQ to verify that at least one host on the subnet is still interested in
receiving traffic directed to that group. When there is no reply to three consecutive
IGMP MQs, the router times out the group and stops forwarding traffic directed toward
that group.
The basic IGMPv2 (defined in RFC 2236) message types are MQ, MR,1 and Leave
Group (LG). MQs can be generic (general MQ) or specific (group-specific MQ). In the
discussion below, the querier is the sender of a Query message—the querier is a multicast
router.
IGMPv2 works much like Version 1, with the addition of the LG
message. With the LG mechanism, receivers/hosts can explicitly communicate their
intention to depart from the group to the local multicast router. The use of LGs reduces
the traffic on subnets, especially if there is a lot of join/change/leave activity. Upon
receiving an LG message the router issues a group-specific query to determine if there are
any remaining hosts on that subnet that are in need of receiving the traffic. If there are no
replies, the router times out the group and stops forwarding the traffic.
The IGMP messages for IGMPv2 are shown in Figure 4.1. The message comprises
an eight-octet structure. During transmission, IGMP messages are encapsulated in IP
datagrams; to indicate that an IGMP packet is being carried, the IP header contains a
protocol number of 2. For reference, Figure 4.2 shows the format of an IP datagram along
1
There are Version 1 and Version 2 MRs.
IGMP MESSAGES 53
[Figure 4.1: IGMPv2 message format (bit positions 0-31)]
Bits 0-7: Type (type of IGMP packet)
Bits 8-15: Max resp time (used in membership query messages)
Bits 16-31: 16-bit checksum
Next 32 bits: Class D address (used in a report packet)
Type values:
0x11: Specifies a membership query packet. This packet is sent by a multicast router.
0x12: Specifies an IGMPv1 membership report packet. This packet is sent by a
multicast host to signal participation in a specific multicast host group.
0x16: Specifies an IGMPv2 membership report packet.
0x17: Specifies a leave group packet. This packet is sent by a multicast host.
with a short list of protocol types, of which IGMP is one of many. An IGMPv2 PDU
consists of a 20-byte IP header and 8 bytes of IGMP.
The Type field identifies the type of IGMP packet.
The Max Response Time field is employed in membership query messages and it
specifies the maximum time a host can wait before sending a corresponding report. This
parameter allows routers to tune the leave latency, that is, the time between the moment
the last receiver (host) leaves a group and the moment the routing protocol is notified that
there are no more active members in that group. The maximum response time is measured
in tenths of a second.
The Checksum field contains a 16-bit checksum for the message. The checksum
is the 16-bit one's complement of the one's complement sum of the whole IGMP
message (the entire IP payload). For computing the checksum, the Checksum field is
set to zero.
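The field layout and checksum rule just described can be exercised with a short sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the message is built in memory rather than sent, and the group address 239.1.1.1 is borrowed from the Chapter 3 examples.

```python
import socket
import struct


def ones_complement_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of `data`."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_igmpv2(msg_type: int, max_resp_tenths: int, group: str) -> bytes:
    """Pack Type, Max Resp Time, Checksum, and Class D address into eight octets."""
    group_bytes = socket.inet_aton(group)
    # The Checksum field is zero while the checksum is being computed.
    unsummed = struct.pack("!BBH4s", msg_type, max_resp_tenths, 0, group_bytes)
    checksum = ones_complement_checksum(unsummed)
    return struct.pack("!BBH4s", msg_type, max_resp_tenths, checksum, group_bytes)


report = build_igmpv2(0x16, 0, "239.1.1.1")  # IGMPv2 membership report
print(len(report), report.hex())
# A receiver verifies the message by summing it with the checksum in place.
print(ones_complement_checksum(report) == 0)
```

Recomputing the checksum over the finished message yields zero when the message is intact, which is the usual way a receiver validates it.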
[Figure 4.2: IP datagram format (bit positions 0-31 per row)]
Version | IHL | TOS | Total length
Identification | Flags | Fragment offset
TTL | Protocol | Header checksum
Source IP address
Destination IP address
Options and padding
IHL, Internet Header Length. 4 bits. Specifies the length of the IP packet header in 32-bit words.
TOS, Type of Service. 8 bits. Specifies the parameters for the type of service requested.
Precedence. 3 bits.
Total length. 16 bits. Contains the length of the datagram.
Identification. 16 bits. Used to identify the fragments of one datagram from those of another.
Flags. 3 bits.
Fragment offset. 13 bits. Used to direct the reassembly of a fragmented datagram.
TTL, Time-to-Live. 8 bits. A timer field used to track the lifetime of the datagram.
When the TTL field is decremented down to zero, the datagram is discarded.
Protocol. 8 bits. This field specifies the encapsulated protocol.
Value Protocol
0 HOPOPT, IPv6 Hop-by-Hop Option
1 ICMP, Internet Control Message Protocol
2 IGAP, IGMP for user Authentication Protocol; IGMP, Internet Group Management Protocol; RGMP, Router-port Group Management Protocol
3 GGP, Gateway to Gateway Protocol
4 IP in IP encapsulation
5 ST, Internet Stream Protocol
6 TCP, Transmission Control Protocol
7 UCL, CBT
8 EGP, Exterior Gateway Protocol
9 IGRP, Interior Gateway Routing Protocol
Et cetera
Header checksum. 16 bits. A 16-bit one's complement checksum of the IP header and IP options.
Source IP address. 32 bits. IP address of the sender.
Destination IP address. 32 bits. IP address of the intended receiver.
This also includes multicast addresses (see Chapter 2).
Options. Variable length.
The Class D Address field contains a valid multicast group address and is used in a
report packet.
Note: In IPv4, multicasting and IGMP support are optional. However, IGMP
functions are integrated directly into IPv6, implying that all IPv6 hosts are required
to support multicasting.
IGMPv3 MESSAGES 55
In the recent past, the IGMPv2 message format has been extended in IGMPv3.
Version 3 allows receivers to subscribe to or exclude a specific set of sources within
a multicast group, rather than just an individual source (this is called, as noted, source-
specific multicast). With this feature, IGMPv3 adds support for “source filtering,” that
is, the ability for a system to report interest in receiving packets sent to a particular
multicast address only from specific source addresses or from all but specific source
addresses. That information may be used by multicast routing protocols to avoid
delivering multicast packets from specific sources to (sub)networks where there are no
interested receivers [RFC3376]. IGMPv3 is not widely implemented as of press time.
“Source filtering” enables a multicast receiver (host) to signal to a multicast-enabled
router which groups the host wants to receive multicast traffic from (that is, signal
membership to a multicast host group) and from which source(s) this traffic is expected.
This membership information allows a multicast-enabled router to forward traffic from
only those sources from which receivers requested the traffic. “Source filtering” supports
an atomic leave/join; this helps recovery from lost “leave” messages and simplifies some
of the error recovery scenarios. This capability also halves the message traffic, as now
only a single message is needed to “leave” one stream and “join” another.
Receivers signal membership to a multicast host group in the following two modes
[CIS200701]:
To support this capability the membership query packet (type 0x11) has been
changed; in addition, a new packet type, 0x22, has been added. Note that all IGMPv3
implementations must still support packet types 0x12, 0x16, and 0x17.
In IGMPv3 there are the following three variants of the Query message:
[Figure 4.3: IGMPv3 membership query message format (bit positions 0-31); the fields include the Group Address.]
When the first bit of the maximum response code is 1, the remaining bits are read as a
3-bit exponent followed by a 4-bit mantissa, and the maximum response time is
$$\text{Max Resp Time} = (\text{mant} \mid \texttt{0x10}) \ll (\text{exp} + 3)$$
where mant = mantissa and exp = exponent.
(| is the Boolean or function; e.g., 0010|10000 = 10010) (note that the exp calculation
uses a binary representation of decimal 3, namely 11). "<<" is the bitwise left shift
operator: it shifts the bits of a number to the left by a given number of places, and
zero is used for filling.
For example, 2147483650 << 1 leads to the following: the 32-bit binary representation
of 2147483650 is 10000000000000000000000000000010. If its leftmost bit is
removed and a zero is filled at the rightmost position, the result is
00000000000000000000000000000100, which is equal to 4 in decimal representation.
Consider the example of 18 << 6: the 11-bit binary representation is 00000010010. If its
leftmost 6 bits are removed and six zeros are filled at the rightmost positions, the result is
10010000000 or decimal 1152. Now we are ready to apply the formula.
As an example, assume that the value of the maximum response code is decimal 178.
The bit string representation of this is 10110010. From this, the fields of the maximum
response code are
. Bit 0 = 1
. exp = 011
. mant = 0010
Now,
$$(0010 \mid 10000) \ll (011 + 11) = 10010 \ll 110 = 18 \ll 6 = 1152$$
Therefore, when the maximum response code is decimal 178, the maximum
response time is 1152 tenths of a second.
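The conversion worked through above is easy to check in code. The sketch below assumes, following RFC 3376, that codes below 128 are taken as the response time directly and that only codes with the first bit set use the exponent and mantissa form; the function name is hypothetical.

```python
def decode_max_resp_code(code: int) -> int:
    """Convert an 8-bit maximum response code into tenths of a second."""
    if not 0 <= code <= 255:
        raise ValueError("maximum response code must fit in one octet")
    if code < 128:
        # First bit is 0: the code is the time itself (assumption from RFC 3376).
        return code
    exp = (code >> 4) & 0x7   # three bits after the leading 1
    mant = code & 0xF         # low four bits
    return (mant | 0x10) << (exp + 3)


print(decode_max_resp_code(178))  # 1152
```

The same function can serve for the QQIC field, since the text notes that its conversion follows the same calculation.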
Small values of Max Resp Time allow IGMPv3 routers to tune the “leave latency”
(the time between the moment the last host leaves a group and the moment the routing
protocol is notified that there are no more members). Larger values allow tuning of the
burstiness of IGMP traffic on a network.
The Checksum field contains a 16-bit checksum and remains unchanged from
Version 2.
The Group Address field contains the Class D address and is the same as Version 2.
The Group Address field is set to zero when sending a general query, and set to
the IP multicast address being queried when sending a group-specific query or a
group-and-source-specific query.
The Resv field is reserved and is set to zero on transmission and is ignored on
reception.
The S Flag field is used as follows: when set to 1, this field indicates that any
receiving multicast routers should suppress the timer updates normally performed upon
receiving a query.
The QRV field (Querier's Robustness Variable) carries a parameter that is used in
tuning timer values for expected packet loss. The higher the value of the QRV, the more
tolerant the environment is of lost packets. One needs to keep in mind, however, that
increasing the QRV also increases the delay in detecting a problem. If nonzero, the QRV
field contains the [Robustness Variable] value used by the querier (that is, the sender of
the query, a multicast router). If the querier's [Robustness Variable] exceeds 7, the
maximum value of the QRV field, the QRV is set to zero. Routers adopt the QRV value
from the most recently received query as their own [Robustness Variable] value, unless
that most recently received QRV was zero, in which case the receiving routers use the default
[Robustness Variable] value.
The QQIC field (Querier's Query Interval Code) carries a code specifying the query
interval used by the originator of this query. In other words, QQIC specifies
the [Query Interval] used by the querier. The actual interval, called the Querier's Query
Interval (QQI), is represented in units of seconds and is derived from the QQIC. Multicast
routers that are not the current querier adopt the QQI value from the most recently
received query as their own [Query Interval] value, unless that most recently received
QQI was zero, in which case the receiving routers use the default [Query Interval] value.
The calculations to convert this code into the actual interval are the same as those used
for the maximum response code discussed above.
The Number of Sources field indicates how many source addresses are contained within
the Query message. This number is zero in a general query or a group-specific query and
nonzero in a group-and-source-specific query. This number is limited by the Maximum
Transmission Unit (MTU) of the network over which the query is transmitted. For example, on
an Ethernet with an MTU of 1500 octets, the IP header including the Router Alert option
consumes 24 octets, and the IGMP fields, including the Number of Sources (N) field,
consume 12 octets, leaving 1464 octets for source addresses, which limits the number of
source addresses to 366 (1464/4).
Source Addresses: This list of fields identifies N IP unicast addresses, where the
value N corresponds to the Number of Sources field.
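The MTU arithmetic above generalizes to other link types. A minimal sketch follows, with the 24-octet IP header and 12-octet fixed IGMP portion taken from the text as defaults; the function name and the 9000-octet example are illustrative assumptions.

```python
def max_query_sources(mtu: int = 1500, ip_header: int = 24, igmp_fixed: int = 12) -> int:
    """Largest number of 4-octet source addresses that fit in one query."""
    return (mtu - ip_header - igmp_fixed) // 4


print(max_query_sources())          # 366 on a 1500-octet Ethernet MTU
print(max_query_sources(mtu=9000))  # a larger MTU, shown only for comparison
```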
As noted earlier, IGMPv3 adds a new type of 0x22 to support the IGMPv3 MR; see
Figure 4.4 for the format of Version 3 reports. MRs are sent by IP systems to report (to
neighboring routers) the current multicast reception state, or changes in the multicast
reception state, of their interfaces. Notice how each group record is assembled in the MR
message.
The Record Type field indicates whether the group record type is a current-state,
filter-mode-change, or source-list-change record. Current-state records (MODE_IS_INCLUDE,
MODE_IS_EXCLUDE) are records sent by a system in response to a
query received on an interface and report the current reception state of that interface.
Filter-mode-change records (CHANGE_TO_INCLUDE_MODE, CHANGE_TO_EXCLUDE_MODE)
are records sent by a system when an interface's state changes for
a particular multicast address. Source-list-change records (ALLOW_NEW_SOURCES,
BLOCK_OLD_SOURCES) are records sent by a system when an interface wishes
to alter the list of source addresses without altering its state. See Table 4.1 for more
details.
[Figure 4.4: IGMPv3 membership report message format (bit positions 0-31): 0x22 (type), Reserved, Checksum, followed by group records; each group record ends with Source Address [N] and Auxiliary Data.]
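The record types named above can be captured as an enumeration and used to assemble a single group record. This is a sketch under assumptions: the numeric values and the record layout (record type, auxiliary data length, number of sources, multicast address, then the source list) follow RFC 3376 rather than anything shown in this chapter, and the names `RecordType` and `build_group_record` are hypothetical.

```python
import socket
import struct
from enum import IntEnum


class RecordType(IntEnum):
    MODE_IS_INCLUDE = 1         # current-state
    MODE_IS_EXCLUDE = 2         # current-state
    CHANGE_TO_INCLUDE_MODE = 3  # filter-mode-change
    CHANGE_TO_EXCLUDE_MODE = 4  # filter-mode-change
    ALLOW_NEW_SOURCES = 5       # source-list-change
    BLOCK_OLD_SOURCES = 6       # source-list-change


def build_group_record(record_type: RecordType, group: str, sources: list) -> bytes:
    """Pack one group record with no auxiliary data."""
    header = struct.pack("!BBH4s", record_type, 0, len(sources), socket.inet_aton(group))
    return header + b"".join(socket.inet_aton(s) for s in sources)


record = build_group_record(RecordType.ALLOW_NEW_SOURCES, "239.1.1.1", ["92.1.1.1"])
print(len(record), record.hex())
```

A record naming one source occupies 12 octets, which is the per-record figure used in the ATM cell discussion later in this chapter.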
Version 3 reports are sent with an IP destination address of 224.0.0.22, to which all
IGMPv3-capable multicast routers listen. A system that is operating in Version 1 or 2
compatibility modes sends Version 1 or 2 reports to the multicast group specified in the
Group Address field of the report. In addition, a system must accept and process any Version 1
or 2 report whose IP Destination Address field contains any of the addresses (unicast or
multicast) assigned to the interface on which the report arrives [RFC3376].
Note: Because of its higher complexity, IGMPv3 is not universally supported by all
the receiver hosts or the receiver applications as of press time; IGMPv2 is more common,
especially in IPTV applications.2
Note: All routers on a subnet must be configured for the same version of IGMP.
Note: One must be careful when using IGMPv3 with switches that support and are
enabled for IGMP snooping, because, as seen above, IGMPv3 messages are different
from the messages used in IGMPv1 and IGMPv2. If a switch does not recognize IGMPv3
messages, then hosts will not correctly receive traffic if IGMPv3 is being used. In this
case, either IGMP snooping may be disabled on the switch or the router may be
configured for IGMPv2 on the interface (which would remove the ability to use SSM
for host applications that cannot support Version 3) [CIS200701].
Note: IGMPv3 has no confirming reply: like IGMPv2, messages must be sent twice
to guard against loss, and there is no mechanism to return an error code to the client.
Note: One limitation of IGMPv3 in the context of telco networks and IPTV is as
follows [ITU200201]: An IGMPv2 PDU fits in a single ATM cell (20-byte IP header,
8 bytes of IGMP). The source filtering capability of IGMPv3 requires at least 52 bytes
2
Some vendors have developed IGMP Version 3 lite (IGMPv3 lite).
and, in turn, two ATM cells (20-byte IP header, 8 bytes of IGMP header, 12 bytes for the
first group record, and 12 bytes for the second group record). This doubles both the ATM
bandwidth and buffer memory required to process channel zapping messages. In turn,
this requires a more complicated ATM Segmentation and Reassembly (SAR) and
increases the amount of software processing in the multicast signaling termination
function within the access network.
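The cell counts in this note follow from the 48-octet payload of an ATM cell. The sketch below reproduces only that arithmetic; it deliberately ignores any adaptation-layer overhead, which is a simplifying assumption, and the function name is hypothetical.

```python
import math

ATM_CELL_PAYLOAD = 48  # payload octets per 53-octet ATM cell


def atm_cells(pdu_bytes: int) -> int:
    """Cells needed for a PDU, ignoring adaptation-layer overhead (assumption)."""
    return math.ceil(pdu_bytes / ATM_CELL_PAYLOAD)


igmpv2_pdu = 20 + 8            # IP header + IGMPv2 message
igmpv3_pdu = 20 + 8 + 12 + 12  # IP header + IGMP header + two group records
print(atm_cells(igmpv2_pdu), atm_cells(igmpv3_pdu))  # 1 2
```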
Note: IGMP, as we have seen, allows an IPv4 host to communicate IP multicast
group membership information to its neighboring routers; IGMPv3 provides the ability
IGMP OPERATION 61
for a host to selectively request or filter traffic from individual sources within a multicast
group. The protocol MLD defined in RFC 2710 offers similar functionality for IPv6
hosts. MLDv2 provides the analogous source filtering functionality of IGMPv3 for IPv6
[RFC4604]. This topic is revisited in Chapter 10.
. A multicast router keeps a list of multicast group memberships and a timer for each
membership; querier routers periodically send a general MQ to solicit member-
ship information. Hosts respond to this general MQ to report their membership
status for each multicast group. Group-specific MQs may also be sent, for
example, when a router needs to check whether there are more members of a
group for which an LG message has been received. If no MRs are received for a
certain multicast group during a predefined period of time, the router assumes that
there are no more members and stops forwarding traffic for that group.
. When a host joins a multicast group, it sends an unsolicited MR for that group. To
cover the possibility of the initial MR being lost or damaged, RFC 2236
recommends that it be repeated once or twice after short delays [Unsolicited
Report Interval]. When a host leaves a multicast group, it sends an LG message.
. When a querier receives an LG message, it sends group-specific MQs to the group
being left. If no MRs are received in response to these MQs, the router assumes
that there are no more members and stops forwarding traffic for that group.
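The router behavior in the list above (refresh a timer on each MR, shorten it on an LG, stop forwarding on expiry) can be sketched as a small table of expiry times. The class and method names are hypothetical, time is passed in explicitly rather than read from a clock, and the default intervals of 260 s and 2 s are assumptions based on the default timer values in RFC 2236.

```python
class GroupMembershipTable:
    """Per-interface group memberships with one expiry time per group."""

    def __init__(self, membership_interval: float = 260.0):
        self.interval = membership_interval
        self.expiry = {}  # group address -> time at which membership lapses

    def on_report(self, group: str, now: float) -> None:
        # An MR (solicited or unsolicited) adds the group or refreshes its timer.
        self.expiry[group] = now + self.interval

    def on_leave(self, group: str, now: float, last_member_query_time: float = 2.0) -> None:
        # After an LG the querier sends group-specific MQs; unless an MR
        # arrives within this short window, the group is timed out.
        if group in self.expiry:
            self.expiry[group] = min(self.expiry[group], now + last_member_query_time)

    def active_groups(self, now: float) -> list:
        # Drop expired groups; traffic is forwarded only for those that remain.
        self.expiry = {g: t for g, t in self.expiry.items() if t > now}
        return sorted(self.expiry)


table = GroupMembershipTable()
table.on_report("239.1.1.1", now=0.0)
table.on_leave("239.1.1.1", now=100.0)
print(table.active_groups(now=101.0))  # still present while the query is outstanding
print(table.active_groups(now=103.0))  # no MR arrived, so the group is gone
```

If another member answers the group-specific query, calling `on_report` again restores the full interval, which matches the case where hosts remain on the subnet.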
Switches Using IGMP Snooping The flooding of a network segment with multi-
cast packets, when in fact there might not be any nodes on that segment that wish to
receive these packets, can saturate an interface link, even when operating at 10/100/1000
Mbps, and/or saturate the buffers on the NIC. IGMP snooping utilizes a router to send out
IGMP Query messages to identify potentially interested receivers. Membership reports
are returned to the router, which builds a mapping table of the group and associates
forwarding filters for the member port. If no router is available, some switches
can take on the query function. IGMPv2 allows only one active querier on the network.
RFC 4541 provides mechanisms to allow switches to “snoop” on IGMP traffic. With
these mechanisms, switches can analyze the data contained within the IGMP header and
determine if the traffic needs to be forwarded to every segment to which the switch is
connected. This reduces the amount of unnecessary multicast traffic flooding to locally
attached networks that have no active receivers.
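The effect of snooping on forwarding can be illustrated with a toy switch model. This is a simplified sketch: the class name, port numbers, and methods are hypothetical, and behavior that a real switch would need (router ports, timers, query handling) is left out.

```python
class SnoopingSwitch:
    """Toy layer 2 switch that learns group members from snooped IGMP messages."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.members = {}  # group address -> set of member ports

    def on_report(self, group: str, port: int) -> None:
        self.members.setdefault(group, set()).add(port)

    def on_leave(self, group: str, port: int) -> None:
        ports = self.members.get(group, set())
        ports.discard(port)
        if not ports:
            self.members.pop(group, None)

    def egress_ports(self, group: str, ingress_port: int) -> list:
        # Known group: forward to member ports only. Unknown group: flood.
        targets = self.members.get(group, self.ports)
        return sorted(targets - {ingress_port})


switch = SnoopingSwitch(ports=[1, 2, 3, 4])
print(switch.egress_ports("239.0.0.11", ingress_port=1))  # flooded: no entry yet
switch.on_report("239.0.0.11", port=3)
print(switch.egress_ports("239.0.0.11", ingress_port=1))  # member port only
```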
See Appendix 4.A for additional information on IGMP operation.
Figure 4.5 illustrates the process at a macrolevel; here receivers seek to access a
video multicast from an IPTV source. The basic steps are as follows [CIS200701]:
1. The receiver sends an IGMP Join message to its (designated) multicast router. As
noted above, an IGMPv2 MR packet has a type of 0x16; this is the Join message
(MR). The destination MAC address maps to the Class D address of the group
being joined, rather than being the MAC address of the router, as covered in Chapter 2.
The body of the IGMP datagram also includes the Class D group address, as shown
in Figure 4.1, and is encapsulated in an IP datagram, as shown in Figure 4.2.
2. The router logs the Join message and utilizes PIM or another multicast routing
protocol to add this segment to the multicast distribution tree, as discussed in
Chapter 3. The action of the router must be fairly quick (typically less than 2 s and
preferably around 1 s) because the join may be the action generated by a user's
STB driving the TV monitor when the viewer hits the change channel button on
his or her remote control device.
3. IP multicast traffic transmitted from the source is now distributed via the
designated router to the client's subnet. The destination MAC address corresponds
to the Class D address of the group. In this example, and as is typical of
satellite-based IPTV systems, the “backbone” network is basically a single hop to
the telco's headend; in theory the backbone can be more complex (as in the
multirouter examples shown in Chapter 3), but then QoS considerations must be
rigorously kept in mind for high-quality entertainment video.
APPENDIX 4.A 63
[Figure 4.5: IGMP join process for IPTV. Content providers feed groups 239.0.0.11 and 239.0.0.33; each receiver sends an IGMP join for the group it wants, and the router adds the group and forwards the stream. Figure notes: IGMP is how hosts tell routers about group membership; routers solicit group membership from directly connected hosts; RFC 2236 specifies Version 2 of IGMP; IGMP is now in Version 3, which according to RFC 3376 is backward compatible with IGMP Versions 1 and 2.]
4. The switch receives the multicast packet and examines its forwarding table. If no
entry exists for the MAC address, the packet is flooded to all ports within the
broadcast domain. If an entry does exist in the switch table, the packet will be
forwarded only to the designated ports.
5. With IGMPv2, the client can cease group membership by sending an IGMP leave
to the router. With IGMPv1, the client remains a member of the group until it fails
to send a Join message in response to a query from the router. Multicast routers
also periodically send an IGMP query to the “all multicast hosts” group or to a
specific multicast group on the subnet to determine which groups are still active
within the subnet. As noted above, each host delays its response to a query by a
small random period and will then respond only if no other host in the group has
already reported. This mechanism prevents many hosts from congesting the
network with simultaneous reports.
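The report suppression described in step 5 can be simulated by giving each host a random delay and letting the earliest timer win. This is a sketch under assumptions: the function name and host labels are hypothetical, the 10-second maximum response time is taken from the IGMPv2 default, and propagation delay between hosts is ignored.

```python
import random


def simulate_query_response(hosts, max_resp_time: float = 10.0, seed: int = 0):
    """Return (reporting host, suppressed hosts) for one query to one group."""
    rng = random.Random(seed)  # seeded so that the run is repeatable
    # Each member picks a random delay within the maximum response time.
    delays = {host: rng.uniform(0.0, max_resp_time) for host in hosts}
    # The host whose timer fires first sends the report ...
    reporter = min(delays, key=delays.get)
    # ... and every other member hears it and cancels its own report.
    suppressed = sorted(h for h in hosts if h != reporter)
    return reporter, suppressed


reporter, suppressed = simulate_query_response(["host B", "host C", "host D"])
print("report sent by", reporter, "; suppressed:", suppressed)
```

However many members share the subnet, the router sees one report per group per query in this model, which is the congestion-avoidance property the step describes.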
4.A.1 Overview3
IGMP is used by IP hosts to report their multicast group memberships to any immediately
neighboring multicast routers. This appendix describes the use of IGMP between hosts
3
This section is based on RFC 2236 Copyright (C) The Internet Society (1997). All rights reserved. This
document and translations of it may be copied and furnished to others, and derivative works that comment on
or otherwise explain it or assist in its implementation may be prepared, copied, published, and distributed, in
whole or in part, without restriction of any kind, provided that the copyright notice and this paragraph are
included on all such copies and derivative works.
and routers to determine group membership. Routers that are members of multicast
groups are expected to behave as hosts as well as routers and may even respond to their
own queries. Like the Internet Control Message Protocol (ICMP), IGMP is an integral
part of IP. It is required to be implemented by all hosts wishing to receive IP multicasts.
IGMP messages are encapsulated in IP datagrams, with an IP protocol number of 2. All
IGMP messages are sent with IP TTL 1 and contain the IP Router Alert option per RFC
2113 in their IP header.
In the discussion below, timer and counter names appear in square brackets. The
term “interface” is sometimes used in RFC 2236 to mean “the primary interface on an
attached network;” if a router has multiple physical interfaces on a single network, this
protocol need only run on one of them. Hosts, on the other hand, need to perform their
actions on all interfaces that have memberships associated with them.
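If a host receives another host's report (Version 1 or 2) while it has a timer running, it stops its
timer for the specified group and does not send a report in order to suppress duplicate
reports.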
When a router receives a report, it adds the group being reported to the list of
multicast group memberships on the network on which it received the report and sets
the timer for the membership to the [Group Membership Interval]. Repeated reports
refresh the timer. If no reports are received for a particular group before this timer
has expired, the router assumes that the group has no local members and that it
need not forward remotely originated multicasts for that group onto the attached
network.
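The router-side bookkeeping described above can be sketched as a table of per-group expiry times. The class and method names are hypothetical; the 260-second default is the value RFC 2236 derives for [Group Membership Interval] from its other defaults, and time is passed in explicitly so that the example stays deterministic.

```python
class MembershipTable:
    """Groups with local members on one attached network, aged by a timer."""

    def __init__(self, group_membership_interval=260.0):
        # 260 s = 2 (robustness) x 125 s (query interval) + 10 s (response interval)
        self.interval = group_membership_interval
        self.expiry = {}  # group address -> absolute expiry time in seconds

    def report_received(self, group, now):
        # A first report adds the group; a repeated report refreshes the timer.
        self.expiry[group] = now + self.interval

    def active_groups(self, now):
        # Groups whose timer has expired are assumed to have no local members.
        self.expiry = {g: t for g, t in self.expiry.items() if t > now}
        return sorted(self.expiry)


table = MembershipTable()
table.report_received("239.1.1.1", now=0.0)
table.report_received("239.1.1.1", now=200.0)  # refresh: expiry moves to 460 s
print(table.active_groups(now=300.0))
print(table.active_groups(now=460.0))
```

Once a group drops out of the table, the router would stop forwarding remotely originated traffic for it onto that network.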
When a host joins a multicast group, it is expected to immediately transmit an
unsolicited Version 2 MR for that group, in case it is the first member of that group on the
network. To cover the possibility of the initial MR being lost or damaged, it is
recommended that it be repeated once or twice after short delays [Unsolicited Report
Interval]. (A simple way to accomplish this is to send the initial Version 2 MR and
then act as if a group-specific query was received for that group and set a timer
appropriately.)
When a host leaves a multicast group, if it was the last host to reply to a query with an
MR for that group, it should send a Leave Group message to the all-routers multicast group
(224.0.0.2). If it was not the last host to reply to a query, it may send nothing as there must be
another member on the subnet. This is an optimization to reduce traffic; a host without
sufficient storage to retain status information as to whether it was the last host to reply may
always send a Leave Group message when it leaves a group. Routers should accept a Leave
Group message addressed to the group being left in order to accommodate implementations
of an earlier version of this standard. Leave Group messages are addressed to the all-routers
group because other group members do not need to know that a host has left the group, but it
does no harm to address the message to the group.
When a querier receives a Leave Group message for a group that has
group members on the reception interface, it sends [Last Member Query Count]
group-specific queries every [Last Member Query Interval] to the group being left.
These group-specific queries have their Max Response Time set to [Last Member
Query Interval]. If no reports are received after the response time of the last query
expires, the routers assume that the group has no local members, as above. Any querier-
to-nonquerier transition is ignored during this time; the same router keeps sending the
group-specific queries.
Nonqueriers must ignore Leave Group messages, and queriers should ignore Leave
Group messages for which there are no group members on the reception interface.
When a nonquerier receives a Group-Specific Query message, if its existing group
membership timer is greater than [Last Member Query Count] times the Max Response
Time specified in the message, it sets its group membership timer to that value.
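The timer arithmetic in the leave procedure above can be made concrete with two small helpers. The function names are illustrative only, and times are expressed in seconds for readability (on the wire, Max Response Time is carried in tenths of a second). The values used in the example, a 1-second [Last Member Query Interval] and a [Last Member Query Count] of 2, are the RFC 2236 defaults.

```python
def querier_leave_timeout(last_member_query_interval, last_member_query_count):
    """Time a querier waits for a report after receiving a Leave Group message."""
    return last_member_query_interval * last_member_query_count


def nonquerier_adjust_timer(remaining, max_response_time, last_member_query_count):
    """Lower a nonquerier's group timer when it hears a group-specific query."""
    limit = last_member_query_count * max_response_time
    # The timer is only ever shortened, never extended, by this rule.
    return min(remaining, limit)


print(querier_leave_timeout(1.0, 2))           # seconds until the group is dropped
print(nonquerier_adjust_timer(200.0, 1.0, 2))  # long timer is cut down
print(nonquerier_adjust_timer(1.5, 1.0, 2))    # short timer is left alone
```

With these defaults, a group with no remaining members is dropped roughly 2 seconds after the last host leaves, instead of waiting for the full membership timer.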
A host may be in one of three possible states with respect to any single IP multicast group
on any single network interface:
. “Nonmember” state, when the host does not belong to the group on the interface.
This is the initial state for all memberships on all network interfaces; it requires no
storage in the host.
. “Delaying Member” state, when the host belongs to the group on the interface and
has a report delay timer running for that membership.
. “Idle Member” state, when the host belongs to the group on the interface and does
not have a report delay timer running for that membership.
The following five significant events can cause IGMP state transitions:
. “Join group” occurs when the host decides to join the group on the interface. It may
occur only in the Nonmember state.
. “Leave group” occurs when the host decides to leave the group on the interface. It
may occur only in the Delaying Member and Idle Member states.
. “Query received” occurs when the host receives either a valid General Membership
Query message or a valid Group-Specific Membership Query message. To be
valid, the Query message must be at least eight octets long and have a correct
IGMP checksum. The group address in the IGMP header must either be zero (a
general query) or a valid multicast group address (a group-specific query). A
general query applies to all memberships on the interface from which the query is
received. A group-specific query applies to membership in a single group on the
interface from which the query is received. Queries are ignored for memberships
in the Nonmember state.
. “Report received” occurs when the host receives a valid IGMP MR message
(Version 1 or 2). To be valid, the Report message must be at least eight octets
long and have a correct IGMP checksum. An MR applies only to the
membership in the group identified by the MR, on the interface from which
the MR is received. It is ignored for memberships in the Nonmember or Idle
Member state.
. “Timer expired” occurs when the report delay timer for the group on the interface
expires. It may occur only in the Delaying Member state.
All other events, such as receiving invalid IGMP messages or IGMP messages other
than Query or Report, are ignored in all states.
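The validity rules above (at least eight octets, a correct IGMP checksum, and a group field that is either zero or a multicast address) can be sketched as follows. The helper names are hypothetical; the eight-octet layout (type, Max Response Time, checksum, group address) and the type codes are taken from RFC 2236, and the checksum is the standard Internet one's-complement sum.

```python
import socket
import struct

IGMP_MEMBERSHIP_QUERY = 0x11
IGMP_V2_MEMBERSHIP_REPORT = 0x16
IGMP_LEAVE_GROUP = 0x17


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement of the one's-complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_igmp(msg_type: int, max_resp_tenths: int, group: str) -> bytes:
    """Build an 8-octet IGMPv2 message with the checksum filled in."""
    addr = socket.inet_aton(group)
    unsummed = struct.pack("!BBH4s", msg_type, max_resp_tenths, 0, addr)
    return struct.pack(
        "!BBH4s", msg_type, max_resp_tenths, internet_checksum(unsummed), addr
    )


def is_valid_query(packet: bytes) -> bool:
    """Apply the 'Query received' validity checks to a raw IGMP message."""
    if len(packet) < 8:
        return False
    if internet_checksum(packet) != 0:  # a correct message sums to zero
        return False
    msg_type, _, _, group = struct.unpack("!BBH4s", packet[:8])
    if msg_type != IGMP_MEMBERSHIP_QUERY:
        return False
    # Group must be zero (general query) or in 224.0.0.0/4 (group-specific).
    return group == b"\x00\x00\x00\x00" or 224 <= group[0] <= 239


general_query = build_igmp(IGMP_MEMBERSHIP_QUERY, 100, "0.0.0.0")
print(general_query.hex(), is_valid_query(general_query))
```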
There are seven possible actions that may be taken in response to the above events as
follows:
. “Send report” for the group on the interface. The type of report is determined by
the state of the interface. The Report message is sent to the group being reported.
. “Send leave” for the group on the interface. If the interface state says the querier is
running IGMPv1, this action should be skipped. If the flag saying we were the last
host to report is cleared, this action may be skipped. The Leave message is sent to
the all-routers group (224.0.0.2).
. “Set flag” that we were the last host to send a report for this group.
. “Clear flag” since we were not the last host to send a report for this group.
. “Start timer” for the group on the interface, using a delay value chosen uniformly
from the interval (0, Max Response Time], where Max Response Time is specified
in the query. If this is an unsolicited report, the timer is set to a delay value chosen
uniformly from the interval (0, [Unsolicited Report Interval]].
. “Reset timer” for the group on the interface to a new value, using a delay value chosen
uniformly from the interval (0, Max Response Time], as described in “start timer.”
. “Stop timer” for the group on the interface.
In the following state diagrams (Figures 4A.2–4A.4), each state transition arc is
labeled with the event that causes the transition and, in parentheses, any actions taken
during the transition. Note that the transition is always triggered by the event; even if the
action is conditional, the transition still occurs.
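The host states, events, and actions listed above can be tied together in a transition table. The table below is a sketch based on one reading of the host state diagram in RFC 2236; it is not reproduced from the figures in this appendix, and the `step` helper is a hypothetical name. Conditional actions (for example, “send leave” only if the flag is set) are listed unconditionally, since the transition occurs either way.

```python
NONMEMBER, DELAYING, IDLE = "Nonmember", "Delaying Member", "Idle Member"

# (state, event) -> (next state, actions taken during the transition)
HOST_TRANSITIONS = {
    (NONMEMBER, "join group"): (DELAYING, ["send report", "set flag", "start timer"]),
    (DELAYING, "leave group"): (NONMEMBER, ["stop timer", "send leave"]),
    # "reset timer" applies only if the new Max Response Time is shorter
    # than the time left on the running timer.
    (DELAYING, "query received"): (DELAYING, ["reset timer"]),
    (DELAYING, "report received"): (IDLE, ["stop timer", "clear flag"]),
    (DELAYING, "timer expired"): (IDLE, ["send report", "set flag"]),
    (IDLE, "query received"): (DELAYING, ["start timer"]),
    (IDLE, "leave group"): (NONMEMBER, ["send leave"]),
}


def step(state, event):
    """Return (next_state, actions); events not listed for a state are ignored."""
    return HOST_TRANSITIONS.get((state, event), (state, []))


state = NONMEMBER
for event in ["join group", "timer expired", "query received",
              "report received", "leave group"]:
    state, actions = step(state, event)
    print(f"{event:16s} -> {state:16s} {actions}")
```

Keeping the transitions in a dictionary makes it easy to check that every (state, event) pair named in the text is either handled or deliberately ignored.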
The all-systems group (address 224.0.0.1) is handled as a special case. The host
starts in the Idle Member state for that group on every interface, never transitions to
another state, and never sends a report for that group.
In addition, a host may be in one of two possible states with respect to any single
network interface:
. “No IGMPv1 Router Present,” when the host has not heard an IGMPv1 style query
for the [Version 1 Router Present Timeout]. This is the initial state.
. “IGMPv1 Router Present,” when the host has heard an IGMPv1 style query within
the [Version 1 Router Present Timeout].
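A router may be in one of two possible states with respect to any single attached
network:
. “Querier,” when this router is designated to transmit IGMP membership queries on
this network.
. “Nonquerier,” when another router is designated to transmit IGMP membership
queries on this network.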
The following three events can cause the router to change states:
. “Query timer expired” occurs when the timer set for query transmission
expires.
. “Query received from a router with a lower IP address” occurs when an IGMP
membership query is received from a router on the same network with a lower IP
address.
. “Other querier present timer expired” occurs when the timer set to note the
presence of another querier with a lower IP address on the network expires.
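There are three actions that may be taken in response to the above events:
. “Start general query timer” for the attached network.
. “Start other querier present timer” for the attached network [Other Querier Present
Interval].
. “Send general query” on the attached network. The general query is sent to the
all-systems group (224.0.0.1) and has a Max Response Time of [Query Response
Interval].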
A router should start in the Initial state on all attached networks and immediately
move to the querier state.
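The querier election implied by the events above (a router yields when it hears a query from a lower IP address) can be sketched in a few lines. The function names and addresses are assumptions for illustration; the point to note is that addresses must be compared numerically, not as strings.

```python
import ipaddress


def elect_querier(router_addresses):
    """Return the address that ends up as querier: the numerically lowest one."""
    return str(min(ipaddress.IPv4Address(a) for a in router_addresses))


def on_query_received(my_address, source_address, state):
    """Apply the 'query received from a router with a lower IP address' event."""
    if ipaddress.IPv4Address(source_address) < ipaddress.IPv4Address(my_address):
        return "Nonquerier"
    return state  # queries from higher addresses do not change our state


print(elect_querier(["10.0.0.9", "10.0.0.10", "10.0.0.2"]))
print(on_query_received("10.0.0.5", "10.0.0.2", "Querier"))
```

A plain string comparison would rank "10.0.0.10" below "10.0.0.2", which is why the sketch converts to `IPv4Address` before comparing.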
In addition, to keep track of which groups have members, a router may be in one of
the four possible states with respect to any single IP multicast group on any single
attached network:
. “No Members Present” state, when there are no hosts on the network which have
sent reports for this multicast group. This is the initial state for all groups on the
router; it requires no storage in the router.
. “Members Present” state, when there is a host on the network that has sent an MR
for this multicast group.
. “Version 1 Members Present” state, when there is an IGMPv1 host on the network
which has sent a Version 1 MR for this multicast group.
. “Checking Membership” state, when the router has received a Leave Group
message but has not yet heard an MR for the multicast group.
There are six significant events that can cause router state transitions:
. “v2 report received” occurs when the router receives a Version 2 MR for the group
on the interface. To be valid, the Report message must be at least eight octets long
and must have a correct IGMP checksum.
. “v1 report received” occurs when the router receives a Version 1 MR for the group
on the interface. The same validity requirements apply.
. “Leave received” occurs when the router receives an IGMP Group Leave message
for the group on the interface. To be valid, the Leave message must be at least
eight octets long and must have a correct IGMP checksum.
. “Timer expired” occurs when the timer set for a group membership expires.
. “Retransmit timer expired” occurs when the timer set to retransmit a
group-specific membership query expires.
. “v1 host timer expired” occurs when the timer set to note the presence of Version 1
hosts as group members expires.
There are seven possible actions that may be taken in response to the above events:
. “Start timer” for the group membership on the interface—also resets the timer to
its initial value [Group Membership Interval] if the timer is currently running.
. “Start timer*” for the group membership on the interface—this alternate action
sets the timer to [Last Member Query Interval] · [Last Member Query Count] if
this router is a querier or the [Max Response Time] in the packet · [Last Member
Query Count] if this router is a nonquerier.
. “Start retransmit timer” for the group membership on the interface [Last Member
Query Interval].
. “Start v1 host timer” for the group membership on the interface, also resets the timer
to its initial value [Group Membership Interval] if the timer is currently running.
. “Send group-specific query” for the group on the attached network. The
group-specific query is sent to the group being queried and has a Max Response
Time of [Last Member Query Interval].
. “Notify routing +” notifies the routing protocol that there are members of this
group on this connected network.
. “Notify routing -” notifies the routing protocol that there are no longer any
members of this group on this connected network.
The state diagram for a router in the Querier state is shown in Figure 4A.3.
The state diagram for a router in the Nonquerier state is similar, but nonqueriers do
not send any messages and are only driven by message reception; see Figure 4A.4. Note
that nonqueriers do not care whether an MR message is Version 1 or 2.
This appendix describes the concept of IGMP snooping. It is summarized from and based
on concepts discussed in [RFC4541].
The IEEE bridge standard [IEEE Std. 802.1D-2004, IEEE Standard for Local and
Metropolitan Area Networks, Media Access Control (MAC) Bridges] specifies how
LAN packets are “bridged” (switched) between LAN segments. Traditionally, when
processing a packet whose destination MAC address is a multicast address, the switch
forwards a copy of the packet into each of the remaining network interfaces that are in
the forwarding state; the spanning tree algorithm ensures that the application of this
rule at every switch in the network makes the packet accessible to all nodes connected
to the network. In recent years, however, vendors have introduced products described
as “IGMP snooping switches.” IGMP snooping switches utilize information in the
upper level protocol headers as factors to be considered in processing at the lower
levels. This is in contrast to the normal switch behavior where multicast traffic is
typically forwarded on all interfaces. IGMP snooping switches filter packets addressed
to unrequested group addresses because, in a generic multicast environment, significant
bandwidth can be wasted by flooding all ports [RFC4541]. Note that the IGMP
snooping function applies only to IPv4 multicasts. For IPv6, MLD must be used
instead.
1. A snooping switch forwards IGMP MRs to only those ports where multicast
routers are attached. In other words, a snooping switch does not forward IGMP
MRs to ports on which only hosts (receivers) are attached. An administrative
control may be provided to override this restriction, allowing the Report messages
to be flooded to other ports. This is the main IGMP snooping functionality for the
control path.
Sending MRs to other hosts can result in unintentionally preventing a host
from joining a specific multicast group. When an IGMPv1 or IGMPv2 host
receives an MR for a group address that it intends to join, the host will suppress
its own MR for the same group. This report suppression is a
requirement for IGMPv1 and IGMPv2 hosts. However, if a switch does not
receive an MR from the host, it will not forward multicast data to it. This is not a
problem in an IGMPv3-only network because there is no suppression of
IGMP MRs. The administrative control allows IGMP MR messages to be
processed by network monitoring equipment such as packet analyzers or port
replicators.
The switch supporting IGMP snooping must maintain a list of multicast
routers and the ports on which they are attached. This list can be constructed in
any combination of the following ways:
(a) This list is built by the snooping switch sending Multicast Router Solicitation
messages as described in IGMP multicast router discovery. It may also snoop
Multicast Router Advertisement messages sent by and to other nodes.
(b) The arrival port for IGMP queries (sent by multicast routers) where the source
address is not 0.0.0.0. The 0.0.0.0 address represents a special case where the
switch is proxying IGMP queries for faster network convergence but is not
itself the querier. The switch does not use its own IP address (even if it has one)
because this would cause the queries to be seen as coming from a newly elected
querier. The 0.0.0.0 address is used to indicate that the query packets are not
from a multicast router.
(c) Ports explicitly configured by management to be IGMP-forwarding ports, in
addition to or instead of any of the above methods to detect router ports.
2. IGMP networks may also include devices that implement “proxy reporting,” in
which reports received from downstream hosts are summarized and used to build
internal membership states. Such proxy-reporting devices may use the all-zeros
IP source address when forwarding any summarized reports upstream. For this
reason, IGMP MRs received by the snooping switch must not be rejected because
the source IP address is set to 0.0.0.0.
3. The switch that supports IGMP snooping must flood all unrecognized IGMP
messages to all other ports and must not attempt to make use of any information
beyond the end of the network layer header. In addition, earlier versions of IGMP
should interpret IGMP fields as defined for their versions and must not alter these
fields when forwarding the message. When generating new messages, a given
IGMP version should set fields to the appropriate values for its own version. If
any fields are reserved or otherwise undefined for a given IGMP version, the
fields should be ignored when parsing the message and must be set to zeros when
new messages are generated by implementations of that IGMP version. An
exception may occur if the switch is performing a spoofing function and is aware
of the settings for new or reserved fields that would be required to correctly spoof
for a different IGMP version.
4. An IGMP snooping switch is typically aware of link-layer topology changes
caused by a spanning tree operation. When a port is enabled or disabled by a
spanning tree, a general query may be sent on all active nonrouter ports to reduce
network convergence time. Nonquerier switches should be aware of whether the
querier is in IGMPv3 mode. If so, the switch does not spoof any general queries
unless it is able to send an IGMPv3 query that adheres to the most recent
information sent by the true querier. In no case should a switch introduce a spoofed
IGMPv2 query into an IGMPv3 network, as this may create excessive network
disruption. If the switch is not the querier, it should use the “all-zeros” IP source
address in these proxy queries (even though some hosts may elect to not process
queries with a 0.0.0.0 IP source address). When such proxy queries are received,
they must not be included in the querier election process.
5. The snooping switch must not rely exclusively on the appearance of IGMP Group
Leave announcements to determine when entries should be removed from the
forwarding table. It should implement a membership timeout mechanism, such
as the router-side functionality of the IGMP, as described in the IGMP and MLD
specifications, on all its nonrouter ports. This timeout value should be
configurable.
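The control-path rules above can be summarized in a small model of a snooping switch. This is a sketch under stated assumptions: the class, its method names, and the 260-second default timeout are illustrative and do not correspond to any vendor's implementation. It covers only router-port learning from queries, report forwarding to router ports, and the membership timeout of rule 5.

```python
class SnoopingSwitch:
    """Toy model of the IGMP snooping control path."""

    def __init__(self, ports, membership_timeout=260.0):
        self.ports = set(ports)
        self.router_ports = set()
        self.members = {}  # group -> {port: absolute expiry time}
        self.timeout = membership_timeout  # should be configurable (rule 5)

    def on_query(self, port, source_ip):
        # Rule 1(b): a query with source 0.0.0.0 is a proxy query and does
        # not identify a multicast router.
        if source_ip != "0.0.0.0":
            self.router_ports.add(port)

    def on_report(self, port, group, now):
        # Record the membership, then forward the report only to router
        # ports so that other hosts do not suppress their own reports.
        self.members.setdefault(group, {})[port] = now + self.timeout
        return sorted(self.router_ports - {port})

    def expire(self, now):
        # Rule 5: age out entries instead of relying only on Leave messages.
        for group in list(self.members):
            live = {p: t for p, t in self.members[group].items() if t > now}
            if live:
                self.members[group] = live
            else:
                del self.members[group]


switch = SnoopingSwitch(ports=[1, 2, 3, 4])
switch.on_query(port=1, source_ip="192.0.2.1")
switch.on_query(port=2, source_ip="0.0.0.0")
print(switch.on_report(port=3, group="239.1.1.1", now=0.0))
```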
1. Packets with a destination IP address outside 224.0.0.X, which are not IGMP, will
be forwarded according to group-based port membership tables and must also be
forwarded on router ports. This is the main IGMP snooping functionality for the
data path. One approach that an implementation could take would be to maintain
separate membership and multicast router tables in software and then “merge”
these tables into a forwarding cache.
2. Packets with a Destination IP (DIP) address in the 224.0.0.X range, which are not
IGMP, must be forwarded on all ports. Many host systems do not send joins for IP
multicast addresses in this range before sending or listening to IP multicast
packets. Furthermore, since the 224.0.0.X address range is defined as link local
(not to be routed), it seems unnecessary to keep the state for each address in this
range. Additionally, some routers operate in the 224.0.0.X address range without
issuing IGMP joins, and these applications would break if the switch were to
prune them due to not having seen a Join Group message from the router.
3. An unregistered packet is defined as an IPv4 multicast packet with a destination
address that does not match any of the groups announced in earlier IGMP MRs. If
a switch receives an unregistered packet, it must forward that packet on all ports
to which an IGMP router is attached. A switch may default to forwarding
unregistered packets on all ports. Switches that do not forward unregistered
packets to all ports must include a configuration option to force the flooding of
unregistered packets on specified ports.
In an environment where IGMPv3 hosts are mixed with snooping switches that
do not yet support IGMPv3, the switch's failure to flood unregistered streams
could prevent Version 3 hosts from receiving their traffic. Alternatively, in
environments where the snooping switch supports all of the IGMP versions that
are present, flooding unregistered streams may cause IGMP hosts to be overwhelmed
by multicast traffic, even to the point of not receiving queries and
failing to issue new MRs for their own groups.
Snooping switches must at least recognize and process IGMPv3 join reports,
even if this processing is limited to the behavior for IGMPv2 joins, that is, it is done
without considering any additional “include source” or “exclude source” filtering.
When IGMPv3 joins are not recognized, a snooping switch may incorrectly
prune off the unregistered data streams for the groups (as noted above);
alternatively, it may fail to add in forwarding to any new IGMPv3 hosts if the
group has previously been joined as IGMPv2 (because the data stream is seen as
already having been registered).
4. All non-IPv4 multicast packets should continue to be flooded out to all remaining
ports in the forwarding state as per normal IEEE bridging operations. This
recommendation is a result of the fact that groups made up of IPv4 hosts and IPv6
hosts are completely separate and distinct groups. As a result, information
gleaned from the topology among members of an IPv4 group would not be
applicable when forming the topology among members of an IPv6 group.
5. IGMP snooping switches may maintain forwarding tables based on either MAC
addresses or IP addresses. If a switch supports both types of forwarding tables,
then the default behavior should be to use IP addresses. IP-address-based
forwarding is preferred because the mapping between IP multicast addresses
and link-layer multicast addresses is ambiguous. In the case of Ethernet, there is a
multiplicity of 1 Ethernet address to 32 IP addresses.
6. Switches which rely on information in the IP header should verify that the IP
header checksum is correct. If the checksum fails, the information in the packet
must not be incorporated into the forwarding table. Further, the packet should be
discarded.
7. When IGMPv3 “include source” and “exclude source” MRs are received on
shared segments, the switch needs to forward the superset of all received MRs on
to the shared segment. Forwarding of traffic from a particular source S to a group G
must happen if at least one host on the shared segment reports an IGMPv3
membership of the type INCLUDE(G, S list 1) or EXCLUDE(G, S list 2), where S
is an element of S list 1 and not an element of S list 2. The practical implementation
of the (G,S1,S2, ...)-based data forwarding tables is not within the scope of this
document. However, one possibility is to maintain two (G,S) forwarding lists—
one for the INCLUDE filter where a match of a specific (G,S) is required before
forwarding will happen and one for the EXCLUDE filter where a match of a
specific (G,S) will result in no forwarding.
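Two of the data-path rules above lend themselves to a short sketch: the choice of egress ports (rules 1 through 3) and the 32-to-1 ambiguity of the IP-to-Ethernet mapping (rule 5). The function names are hypothetical. The mapping itself (prefix 01:00:5e plus the low-order 23 bits of the group address) is the standard one for IPv4 over Ethernet. For unregistered packets the sketch forwards to router ports only, which is one of the behaviors the text permits; a switch may instead default to flooding.

```python
import ipaddress

LINK_LOCAL_RANGE = ipaddress.IPv4Network("224.0.0.0/24")


def multicast_mac(group):
    """Map an IPv4 group address to its Ethernet multicast MAC address."""
    low23 = int(ipaddress.IPv4Address(group)) & 0x7FFFFF
    mac = (0x01005E << 24) | low23
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


def egress_ports(dst_ip, in_port, all_ports, router_ports, members):
    """Pick output ports for a non-IGMP IPv4 multicast packet."""
    if ipaddress.IPv4Address(dst_ip) in LINK_LOCAL_RANGE:
        out = set(all_ports)                              # rule 2: flood 224.0.0.X
    elif dst_ip in members:
        out = set(members[dst_ip]) | set(router_ports)    # rule 1: registered group
    else:
        out = set(router_ports)                           # rule 3: unregistered packet
    return sorted(out - {in_port})                        # never send back out the arrival port


# Two different groups that share the low-order 23 bits collide on one MAC.
print(multicast_mac("224.1.1.1"), multicast_mac("225.129.1.1"))
print(egress_ports("239.1.1.1", 4, [1, 2, 3, 4], [1], {"239.1.1.1": [3]}))
```

The collision shown by the first `print` is the reason the text prefers IP-address-based forwarding tables over MAC-based ones.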
Step    Command                                   Purpose
Step 1  Router(config)# interface type number     Selects an interface that is connected to hosts
                                                  on which IGMPv3 can be enabled.
Step 2  Router(config-if)# ip pim                 Enables PIM on an interface. One must use either
        {sparse-mode | sparse-dense-mode}         sparse mode or sparse–dense mode.
Step 3  Router(config-if)# ip igmp version 3      Enables IGMPv3 on this interface. The default
                                                  version of IGMP is set to Version 2.
To verify that IGMPv3 is configured properly, one can use the following show
commands:
REFERENCES
[CIS200701] Cisco Systems, Internet Protocol (IP) Multicast Technology Overview, White Paper,
Cisco Systems, Inc., San Jose, CA.
[ITU200201] ITU-T FS-VDSL Telecommunication Standardization Sector of ITU, Full-Service
VDSL Focus Group, Channel Change Protocol, Version 1.00, November 29 2002.
[PAR200601] L. Parziale, W. Liu, et al., TCP/IP Tutorial and Technical Overview, IBM Press,
Redbook Abstract, IBM Form Number GG24-3376-07, 2006.
[RFC2236] RFC 2236, Internet Group Management Protocol, Version 2, W. Fenner, November
1997.
[RFC3376] RFC 3376, Internet Group Management Protocol, Version 3, B. Cain, S. Deering, I.
Kouvelas, B. Fenner, A. Thyagarajan, October 2002.
[RFC4541] RFC 4541, Considerations for Internet Group Management Protocol (IGMP)
and Multicast Listener Discovery (MLD) Snooping Switches, M. Christensen, K. Kimball,
F. Solensky, May 2006. (status: informational).
[RFC4604] RFC 4604, Using Internet Group Management Protocol, Version 3 (IGMPv3) and
Multicast Listener Discovery Protocol, Version 2 (MLDv2) for Source-Specific Multicast,
H. Holbrook, B. Cain, B. Haberman, August 2006.
5
MULTICAST ROUTING—
SPARSE-MODE PROTOCOLS:
PROTOCOL INDEPENDENT
MULTICAST
In Chapter 3, we discussed multicast forwarding algorithms and how they are used to
establish efficient (multicast) paths in the network. A number of multicast routing
protocols have been developed over the years that implement and support these algorithms.
The next few chapters describe these protocols. This chapter opens the discussion
by examining the Protocol Independent Multicast (PIM). PIM has two modes of
operation: PIM DM, specified in RFC 3973, and PIM SM, specified in RFC 2362. This
chapter focuses on the SM (DM is covered in Chapter 7). PIM does not send and/or receive
multicast routing updates between routers as is the case in IP routing: instead of building
up a completely separate multicast routing table, PIM uses the unicast routing information
to support the multicast forwarding function. At the same time, it is independent of the
unicast routing protocol in use: it can make use of any underlying unicast routing protocol utilized to
manage the unicast routing table, including EIGRP, OSPF, BGP, or even static routes.
Specifically, PIM uses the unicast routing table to perform the RPF check function. PIM is
implemented by the leading router manufacturers, particularly by Cisco Systems, and is
widely deployed. A preliminary overview in Section 5.1 is followed by a more detailed
description of the protocol in Section 5.2.
INTRODUCTION TO PIM 79
A number of algorithms have been developed over the years. The RPF algorithm
discussed in Chapter 3 uses a multicast delivery tree to enable the forwarding of packets from
the source to each member in the multicast group. With RPF, packets are replicated only at
the furthest branches in the delivery tree where connectivity is still possible. See Figure 5.1.
To identify the routers that require connectivity to support receiver membership to
individual groups, distribution trees are established and updated dynamically. Using this
process, duplicate packets that may be generated by network loops are discarded along the
way. A reverse path table is used at each nodal router to maintain a map between every
known source and the optimal (preferred) interface required to reach that source. When
a nodal router needs to forward multicast packets along, it does so based on the following
rule: if the packet arrived over the interface used to transmit data back to the source, the
packet is forwarded through every appropriate downstream interface; otherwise, if the
packet arrived through a suboptimal path, it is discarded. RPF has the following
characteristics:
. Traffic follows the shortest path from the source to each destination.
. A different tree is computed for each source node.
. Packet delivery is distributed over multiple network links.
[Figure 5.1 labels: Router 1, Router 2, subnetwork R1, Join Group G, receivers; multicast delivery path from source = S and group address = G]
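The RPF forwarding rule described above can be sketched as a lookup followed by a comparison. The function names, interface names, and prefixes are assumptions for illustration; the reverse path table is modeled as a mapping from unicast prefixes to the interface used to reach them, with the longest matching prefix taken as the preferred path.

```python
import ipaddress


def rpf_interface(reverse_path_table, source):
    """Return the preferred interface toward the source (longest-prefix match)."""
    src = ipaddress.ip_address(source)
    matches = [(net, iface) for net, iface in reverse_path_table.items() if src in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def rpf_forward(reverse_path_table, source, in_iface, downstream_ifaces):
    """Forward only if the packet arrived on the interface used to reach its source."""
    if rpf_interface(reverse_path_table, source) != in_iface:
        return []  # arrived over a suboptimal path: discard
    return [i for i in downstream_ifaces if i != in_iface]


table = {
    ipaddress.ip_network("10.0.0.0/8"): "eth0",
    ipaddress.ip_network("10.1.0.0/16"): "eth1",
}
print(rpf_forward(table, "10.1.2.3", "eth1", ["eth0", "eth2"]))  # passes the check
print(rpf_forward(table, "10.1.2.3", "eth0", ["eth1", "eth2"]))  # fails the check
```

Because the check uses only the unicast table, duplicates created by loops are dropped without any multicast-specific routing state.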
PIM SM is typically used when there are few receivers in a group, when senders and receivers are
separated by point-to-point WAN links, and when the type of traffic is intermittent. This
is generally the case for datacasting applications. Note that modern PIM routers are able
to simultaneously support DM for some multipoint groups and SM for others. SM PIM is
optimized for environments where there are many multipoint data streams and each data
stream is required by a relatively small number of subnets. PIM SM postulates that no
hosts want the multicast traffic unless they specifically ask for it. PIM provides the ability
to switch between SM and DM and also permits both modes to be used within the same
group.
As noted, PIM SM works by defining a Rendezvous Point (RP). An RP is the point in the network where
multicast sources connect to multicast receivers. When a sender wishes to send data, it
first sends it to the RP, and when a receiver wishes to receive data, it registers with the RP.
After the data stream begins to flow from sender to RP to receiver, the routers in the path
optimize the path automatically to remove any unnecessary hops.
Senders register their requests with the RP and in so doing join a tree rooted at the RP.
The RP is similar to the center point used in the center-based tree algorithm. Initially,
traffic from the sender flows through the RP to reach each receiver. The benefit of an SM
protocol is that multicast data is blocked from a network segment unless a receiver
specifically signals to receive the data. This reduces the amount of traffic traversing the
network for those cases where indeed there are just a few receivers on the network at any
given point in time (as noted, this is not generally the model for IPTV or DVB-H). This
approach also implies that no pruning information is maintained for routers without
active receivers; pruning information is maintained only in routers connected to the
multicast delivery tree. In typical implementations (especially in enterprise networks),
the RP is administratively configured; sources register with the RP and then data is
forwarded down the shared tree to the receivers (see footnote 2).
Given the membership requirements of typical multicast (datacasting) applications
over the Internet, PIM SM is currently the most popular multicast routing protocol in use
(PIM SM scales well to a network of any size). The reason is that a multicast source
connected over the Internet may have a (relatively) small set of potential receivers that
are distributed all over the globe; for such cases, PIM SM makes good sense. In contrast, current
IPTV/DVB-H applications are delivered over private networks (the Internet is not
generally used because of QoS considerations) and tend to be concentrated in a
geographic area [the market served by a telco, such as a city, county, or DMA (Designated
Market Area)].
There are a number of protocols for multicast routing within an AS that predate
PIM, which are discussed in the chapters that follow, but PIM is the most widely
Footnote 2: Cisco has implemented an alternative to choosing just DM or just SM on a router interface, called
sparse–dense mode. This was necessitated by a change in the paradigm for forwarding multicast traffic via PIM
that became apparent during its development. It turned out that it was more efficient to choose sparse or dense on
a per-group basis rather than a per-router interface basis. Sparse–dense mode facilitates this ability: when network
administrators configure sparse–dense mode, individual groups are
run in either sparse or dense mode depending on whether RP information is available for that group. If the router
learns RP information for a particular group, the group is treated as SM; otherwise, that group is treated as DM
[CIS200701].
Step 1: A multicast router that has active receivers sends periodic PIM Join messages
to a group-specific RP. Each multicast router along the path toward the RP (the
upstream direction) generates and issues PIM Join requests to the RP. This process builds a
group-specific multicast delivery reverse path tree rooted at the RP. Figure 5.2
depicts this operation. Note that the PIM Join requests follow a reverse path from the
receiver to the RP, building out the reverse path tree.
Step 2: The multicast router supporting the source initially encapsulates each
multicast packet in a unicast register message sent to the RP. The RP decapsulates these
unicast messages and forwards the packets to the set of downstream receivers, as
depicted in Figure 5.3.
PIM SM DETAILS 83
[Figure labels: Source Host A; Multicast stream; RP; Receiver; Receiver]
Step 3: The RP-based tree may need optimization. At this juncture the router
supporting the source can create a source-based multicast delivery tree (e.g., see
Figure 5.4).
Step 4: When the downstream router starts to receive multicast packets through both
the RP-based delivery tree and the source-based delivery tree, it generates PIM Prune
messages, which are sent upstream toward the RP. See Figure 5.5. This causes the RP
to prune this branch of the tree. When this process is complete, multicast data from
the source is forwarded only through the source-based delivery tree.
Figure 5.6 depicts the basic format of a PIM message. Section 5.2 provides
additional details on this topic; casual readers may focus only on the first few sections
that follow—more advanced readers may want to go through the entire section.
This section (based on RFC 2362; see footnote 3) describes in some detail PIM SM, which as we
have seen is a protocol that can be used for efficiently routing to multicast groups
that may span wide-area (and interdomain) internets. This description is for
informative value only; developers should refer directly to the latest RFC/release
of the protocol.
3
Copyright (C) The Internet Society (1998). All rights reserved. This document and translations of it may be
copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its
implementation may be prepared, copied, published, and distributed, in whole or in part, without restriction of
any kind, provided that the copyright notice and this paragraph are included on all such copies and derivative
works.
[Figure labels: Source Host A; RP; Receiver; Receiver]
[Figure labels: Source Host A; RP; Receivers; PIM Join Toward Source; Source-Based Delivery Tree; Multicast Data]
[Figure labels: Source Host A; RP; Receivers; Source-based delivery tree now carrying multicast data; PIM prune message]
5.2.1 Approach
Section 5.2.2 summarizes PIM SM operation; it describes the protocol from a network
perspective, specifically, how the participating routers interact to create and maintain the
multicast distribution tree. Section 5.2.3 describes PIM SM operations from the
perspective of a single router implementing the protocol; this section constitutes the
main body of the protocol specification. It is organized according to PIM SM message
type; for each message type, the section describes its contents, its generation, and its
processing. Section 5.2.4 provides packet format details.
5.2.2.3 Hosts Sending to a Group. When a host starts sending multicast data
packets to a group, initially, its DR must deliver each packet to the RP for distribution
down the RP-tree. The sender's DR initially encapsulates each data packet in a Register
message and unicasts it to the RP for that group. The RP decapsulates each Register
message and forwards the enclosed data packet natively to downstream members on the
shared RP-tree.
If the data rate of the source warrants the use of a source-specific SPT, the RP may
construct a new multicast route entry that is specific to the source, hereafter referred to
as (S,G) state, and send periodic Join/Prune messages toward the source. Note that over
time the rules for when to switch can be modified without global coordination. When
and if the RP does switch to the SPT, the routers between the source and the RP build
and maintain the (S,G) state in response to these messages and send (S,G) messages
upstream toward the source.
The source's DR must stop encapsulating data packets in registers when (and so long as)
it receives Register-Stop messages from the RP. The RP triggers Register-Stop messages in
response to registers if the RP has no downstream receivers for the group (or for that
particular source) or if the RP has already joined the (S,G) tree and is receiving the data
packets natively. Each source's DR maintains, per (S,G), a register-suppression timer.
The register-suppression timer is started by the Register-Stop message; upon expiration, the
source's DR resumes sending data packets to the RP, encapsulated in Register messages.
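The register-suppression behavior just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name RegisterState, its method names, and the 60-second timeout are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RegisterState:
    """Per-(S,G) register state kept by the source's DR (illustrative only)."""
    suppression_timeout: float = 60.0          # assumed value, in seconds
    suppressed_until: Optional[float] = None   # None means the timer is not running

    def on_register_stop(self, now: float) -> None:
        # A Register-Stop from the RP starts (or restarts) the suppression timer.
        self.suppressed_until = now + self.suppression_timeout

    def should_encapsulate(self, now: float) -> bool:
        # Once the timer expires, the DR resumes sending Register messages.
        if self.suppressed_until is not None and now >= self.suppressed_until:
            self.suppressed_until = None
        return self.suppressed_until is None


state = RegisterState()
sent_before = state.should_encapsulate(now=0.0)   # no Register-Stop received yet
state.on_register_stop(now=1.0)
sent_during = state.should_encapsulate(now=30.0)  # timer still running
sent_after = state.should_encapsulate(now=61.0)   # timer has expired
```

Time is passed in explicitly so that the sketch can be exercised without a real clock; an implementation would normally drive this from its own timer facility.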
A router with directly connected members first joins the shared RP-tree. The router
can switch to a source's shortest path tree (SP-tree) after receiving packets from that
source over the shared RP-tree. The recommended policy is to initiate the switch to the
SP-tree after receiving a significant number of data packets during a specified time
interval from a particular source. To realize this policy, the router can monitor data
packets from sources for which it has no source-specific multicast route entry and initiate
such an entry when the data rate exceeds the configured threshold.
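One way to picture this switchover policy is a sliding-window packet counter per (source, group) pair. The sketch below is an assumption-laden illustration: SptSwitchMonitor, its parameters, and the choice of counting packets rather than bytes are all hypothetical, and the threshold values are arbitrary.

```python
from collections import defaultdict, deque


class SptSwitchMonitor:
    """Counts shared-tree packets per (source, group) within a time window."""

    def __init__(self, packet_threshold: int, interval: float) -> None:
        self.packet_threshold = packet_threshold   # assumed configuration knob
        self.interval = interval                   # window length in seconds
        self.arrivals = defaultdict(deque)         # (source, group) -> timestamps
        self.sg_entries = set()                    # pairs that already have (S,G) state

    def on_shared_tree_packet(self, source: str, group: str, now: float) -> bool:
        """Return True when this packet should trigger creation of (S,G) state."""
        key = (source, group)
        if key in self.sg_entries:
            return False                           # already source-specific
        window = self.arrivals[key]
        window.append(now)
        while window and now - window[0] > self.interval:
            window.popleft()                       # drop arrivals outside the window
        if len(window) > self.packet_threshold:
            self.sg_entries.add(key)               # initiate the (S,G) entry
            del self.arrivals[key]
            return True
        return False


monitor = SptSwitchMonitor(packet_threshold=3, interval=1.0)
decisions = [
    monitor.on_shared_tree_packet("10.0.0.1", "239.1.1.1", t)
    for t in (0.0, 0.1, 0.2, 0.3, 0.4)
]
```

Only packets from sources without a source-specific entry are counted, which mirrors the monitoring rule in the text.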
When an (S,G) entry is activated (and periodically so long as the state exists), a Join/
Prune message is sent upstream toward the source S, with S in the join list. The payload
contains Multicast-Address = G, Join = S, Prune = NULL. When the (S,G) entry is created,
the outgoing interface list is copied from (*,G), that is, all local shared tree branches are
replicated in the new SP-tree. In this way, when a data packet from S arrives and matches
on this entry, all receivers will continue to receive the source's packets along this path. (In
more complicated scenarios, other entries in the router have to be considered.) Note that
the (S,G) state must be maintained in each last-hop router that is responsible for initiating
and maintaining an SP-tree. Even when (*,G) and (S,G) overlap, both states are needed to
trigger the source-specific Join/Prune messages. The (S,G) state is kept alive by data
packets arriving from that source. A timer, [Entry-Timer], is set for the (S,G) entry and this
timer is restarted whenever data packets for (S,G) are forwarded out at least one oif or
registers are sent. When the entry timer expires, the state is deleted. The last-hop router is
the router that delivers the packets to their ultimate end-system destination. This is the
router that monitors if there is group membership and joins or prunes the appropriate
distribution trees in response. In general, the last-hop router is the DR for the LAN.
However, under various conditions described later, a parallel router connected to the same
LAN may take over as the last-hop router in place of the DR.
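The creation of an (S,G) entry from the (*,G) entry, and the entry timer that keeps it alive, might be modeled as follows. The names RouteEntry, create_sg_entry, on_data_forwarded, and is_expired are hypothetical, and the 210-second timeout is an assumed value used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

ENTRY_TIMEOUT = 210.0  # assumed [Entry-Timer] value in seconds


@dataclass
class RouteEntry:
    iif: str                                   # expected incoming interface
    oifs: Set[str] = field(default_factory=set)
    spt_bit: bool = False                      # cleared until the SP-tree branch is set up
    expires_at: Optional[float] = None


def create_sg_entry(star_g: RouteEntry, iif_toward_source: str, now: float) -> RouteEntry:
    """Copy the (*,G) oif list into a new (S,G) entry, never including the iif."""
    oifs = set(star_g.oifs) - {iif_toward_source}
    return RouteEntry(iif=iif_toward_source, oifs=oifs, expires_at=now + ENTRY_TIMEOUT)


def on_data_forwarded(entry: RouteEntry, now: float) -> None:
    """Restart the entry timer when data is forwarded out at least one oif."""
    if entry.oifs:
        entry.expires_at = now + ENTRY_TIMEOUT


def is_expired(entry: RouteEntry, now: float) -> bool:
    return entry.expires_at is not None and now >= entry.expires_at


star_g = RouteEntry(iif="eth0", oifs={"eth1", "eth2"})
sg = create_sg_entry(star_g, iif_toward_source="eth2", now=0.0)
on_data_forwarded(sg, now=100.0)   # traffic keeps the (S,G) state alive
```

The sketch omits the case in which registers (rather than forwarded data) restart the timer.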
Only the RP and routers with local members can initiate switching to the SP-tree;
intermediate routers do not. Consequently, last-hop routers create an (S,G) state in
response to data packets from the source S, whereas intermediate routers only create an
(S,G) state in response to Join/Prune messages from downstream that have S in the join list.
The (S,G) entry is initialized with the SPT-bit cleared, indicating that the SP-tree
branch from S has not yet been set up completely, and the router can still accept packets
from S that arrive on the (*,G) entry's indicated incoming interface (iif). Each PIM
multicast entry has an associated incoming interface on which packets are expected to
arrive.
When a router with an (S,G) entry and a cleared SPT-bit starts to receive packets
from the new source S on the iif for the (S,G) entry, and that iif differs from the (*,G)
entry's iif, the router sets the SPT-bit and sends a Join/Prune message toward the RP,
indicating that the router no longer wants to receive packets from S via the shared
RP-tree. The Join/Prune message sent toward the RP includes S in the prune list, with
the RPT-bit set, indicating that S's packets must not be forwarded down this branch of
the shared tree. If the router receiving the Join/Prune message has an (S,G) state (with
or without the route entry's RPT-bit flag set), it deletes the arriving interface from the
(S,G) oif list. If the router has only a (*,G) state, it creates an entry with the RPT-bit
flag set to 1. For brevity, one refers to an (S,G) entry that has the RPT-bit flag set to 1 as
an (S,G)RPT-bit entry. This notational distinction is useful to point out the different
actions taken for (S,G) entries depending on the setting of the RPT-bit flag. Note that a
router can have no more than one active (S,G) entry for any particular S and G at any
particular time, whether the RPT-bit flag is set or not. In other words, a router never has
both an (S,G) and an (S,G)RPT-bit entry for the same S and G at the same time. The
Join/Prune message payload contains Multicast-Address = G, Join = NULL, Prune =
S, RPT-bit.
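The transition from the shared tree to the SP-tree at a last-hop router can be illustrated with a small function that sets the SPT-bit and produces the prune payload described above. The types Entry and JoinPrune and the function on_source_packet are hypothetical names introduced for this sketch; addresses are represented as plain strings.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Entry:
    iif: str
    spt_bit: bool = False


@dataclass
class JoinPrune:
    group: str
    joins: List[Tuple[str, bool]]    # (address, rpt_bit)
    prunes: List[Tuple[str, bool]]   # (address, rpt_bit)


def on_source_packet(source: str, group: str, arrival_iif: str,
                     sg: Entry, star_g: Entry) -> Optional[JoinPrune]:
    """Set the SPT-bit and build the RPT-bit prune when data arrives on the SP-tree."""
    if sg.spt_bit:
        return None                  # switchover already completed
    if arrival_iif != sg.iif or sg.iif == star_g.iif:
        return None                  # not on the (S,G) iif, or both trees share the iif
    sg.spt_bit = True
    # Multicast-Address = G, Join = NULL, Prune = S with the RPT-bit set.
    return JoinPrune(group=group, joins=[], prunes=[(source, True)])


sg = Entry(iif="eth1")
star_g = Entry(iif="eth0")
early = on_source_packet("192.0.2.10", "239.1.1.1", "eth0", sg, star_g)
msg = on_source_packet("192.0.2.10", "239.1.1.1", "eth1", sg, star_g)
```

The returned message would be sent toward the RP; how it is encoded and transmitted is outside the scope of this sketch.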
A new receiver may join an existing RP-tree on which a source-specific prune state
has been established (e.g., because downstream receivers have switched to SP-trees). In
this case, the prune state must be eradicated upstream of the new receiver to bring all
sources' data packets down to the new receiver. Therefore, when an (*,G) join arrives at a
router that has any (S,G)RPT-bit entries (i.e., entries that cause the router to send
source-specific prunes toward the RP), these entries must be updated upstream of
the router so as to bring all sources' packets down to the new member. To accomplish
this, each router that receives an (*,G) Join/Prune message updates all the existing
(S,G)RPT-bit entries. The router may also trigger an (*,G) Join/Prune message upstream
to cause the same updating of RPT-bit settings upstream and pull down all active sources'
packets. If the arriving (*,G) join has some sources included in its prune list, then the
corresponding (S,G) RPT-bit entries are left unchanged (i.e., the RPT-bit remains set and
no oif is added).
A domain in this context is a contiguous set of routers that all implement PIM and are
configured to operate within a common boundary defined by PIM Multicast Border
Routers (PMBRs). PMBRs connect each PIM domain to the rest of the Internet.
Routers use a set of available RPs (called the RP-set) distributed in Bootstrap
messages to get the proper group-to-RP mapping. The following paragraphs summarize
the mechanism; details of the mechanism may be found in RFC 2362. A (small) set
of routers, within a domain, is configured as Candidate BSRs (C-BSRs) and, through a
simple election mechanism, a single BSR is selected for that domain. A set of routers
within a domain is also configured as Candidate RPs (C-RPs); typically, these will be
the same routers that are configured as C-BSRs. C-RPs periodically unicast C-RP-
Advertisement messages (C-RP-Advs) to the BSR of that domain. C-RP-Advs include
the address of the advertising C-RP, as well as an optional group address and a Mask
Length field, indicating the group prefix(es) for which the candidacy is advertised. The
BSR then includes a set of these C-RPs (the RP-set), along with the corresponding
group prefixes, in Bootstrap messages it periodically originates. Bootstrap messages
are distributed hop-by-hop throughout the domain.
Routers receive and store Bootstrap messages originated by the BSR. When a DR
gets a membership indication from IGMP for (or a data packet from) a directly connected
host, for a group for which it has no entry, the DR uses a hash function to map the group
address to one of the C-RPs whose group prefix includes the group. The DR then sends a
Join/Prune message toward (or unicasts registers to) that RP.
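The group-to-RP mapping can be sketched as follows. The hash expression used here is the one recalled from RFC 2362 and should be checked against the RFC before being relied upon; the function names, the representation of the RP-set as (address, prefix) pairs, and the tie-break on the highest address are assumptions made for this illustration.

```python
import ipaddress
from typing import List, Optional, Tuple


def rp_hash_value(group: str, hash_mask_len: int, candidate_rp: str) -> int:
    """Hash value for one candidate RP (form recalled from RFC 2362; verify there)."""
    g = int(ipaddress.IPv4Address(group))
    mask = (0xFFFFFFFF << (32 - hash_mask_len)) & 0xFFFFFFFF
    c = int(ipaddress.IPv4Address(candidate_rp))
    inner = (1103515245 * (g & mask) + 12345) ^ c
    return (1103515245 * inner + 12345) % (2 ** 31)


def select_rp(group: str, hash_mask_len: int,
              rp_set: List[Tuple[str, str]]) -> Optional[str]:
    """Pick the RP for a group from (rp_address, group_prefix) pairs."""
    g = ipaddress.IPv4Address(group)
    candidates = [rp for rp, prefix in rp_set if g in ipaddress.ip_network(prefix)]
    if not candidates:
        return None
    # Highest hash value wins; ties are assumed to go to the highest address.
    return max(
        candidates,
        key=lambda rp: (rp_hash_value(group, hash_mask_len, rp),
                        int(ipaddress.IPv4Address(rp))),
    )


rp_set = [
    ("10.0.0.1", "224.0.0.0/4"),
    ("10.0.0.2", "224.0.0.0/4"),
    ("10.0.0.3", "232.0.0.0/8"),
]
chosen = select_rp("239.1.1.1", 30, rp_set)
```

Because every router applies the same function to the same RP-set, all routers in the domain arrive at the same RP for a given group without further coordination. Groups that agree in the bits covered by the hash mask map to the same RP.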
The Bootstrap message indicates liveness of the RPs included therein. If an RP is
included in the message, then it is tagged as “up” at the routers, while RPs not included in
the message are removed from the list of RPs over which the hash algorithm acts. Each
router continues to use the contents of the most recently received Bootstrap message until
it receives a new Bootstrap message.
If a PIM domain partitions, each area separated from the old BSR will elect its own
BSR, which will distribute an RP-set containing RPs that are reachable within that
partition. When the partition heals, another election will occur automatically and only
one of the BSRs will continue to send out Bootstrap messages. As is expected at the
time of a partition or healing, some disruption in packet delivery may occur. This time
will be on the order of the region's round-trip time and the bootstrap router timeout
value.
A data packet will match on an (*,*,RP) entry if there is no more specific entry [such
as (S,G) or (*,G)], and the destination group address in the packet maps to the RP listed in
the (*,*,RP) entry. In this sense, an (*,*,RP) entry represents an aggregation of all the
groups that hash to that RP. PMBRs initialize the (*,*,RP) state for each RP in the
domain's RP-set. The (*,*,RP) state causes the PMBRs to send (*,*,RP) Join/Prune
messages toward each of the active RPs in the domain. As a result, distribution trees are
built that carry all data packets originated within the PIM domain (and sent to the RPs)
down to the PMBRs.
PMBRs are also responsible for delivering externally generated packets to routers
within the PIM domain. To do so, PMBRs initially encapsulate externally originated
packets (i.e., received on DVMRP interfaces) in Register messages and unicast them to
the corresponding RP within the PIM domain. The Register message has a bit indicating
that it was originated by a border router and the RP caches the originating PMBR's
address in the route entry so that duplicate registers from other PMBRs can be declined
with a Register-Stop message.
All PIM routers must be capable of supporting the (*,*,RP) state and interpreting
associated Join/Prune messages.
same as one of those, the packet is forwarded to the oif list of the matching
entry.
(c) Otherwise, the iif does not match any entry for G and the packet is discarded.
Data packets never trigger prunes. However, data packets may trigger actions that in
turn trigger prunes.
Designated Router Election. When there are multiple routers connected to a multiaccess
network, one of them must be chosen to operate as the DR at any point in time. The DR is
responsible for sending triggered Join/Prune and Register messages toward the RP.
A simple DR election mechanism is used for both SM and traditional IP multicast
routing. Neighboring routers send Hello messages to each other. The sender with the
largest network layer address assumes the role of DR. Each router connected to the
multiaccess LAN sends the hellos periodically in order to adapt to changes in router
status.
be compared with another metric value provided both metric preferences are the same. A
metric preference can be assigned per unicast routing protocol and needs to be consistent
for all routers on the multiaccess network.
Asserts are also needed for (*,G) entries since an RP-tree and an SP-tree for the same
group may both cross the same multiaccess network. When an assert is sent for a (*,G)
entry, the first bit in the metric preference (RPT-bit) is always set to 1 to indicate that this
path corresponds to the RP-tree and that the match must be done on (*,G) if it exists.
Furthermore, the RPT-bit is always cleared for metric preferences that refer to SP-tree
entries; this causes an SP-tree path to always look better than an RP-tree path. When the
SP-tree and RP-tree cross the same LAN, this mechanism eliminates the duplicates that
would otherwise be carried over the LAN.
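The effect of carrying the RPT-bit as the first bit of the metric preference can be seen in a short comparison routine. This sketch assumes a 32-bit preference field whose most significant bit is the RPT-bit, that lower preference and lower metric are better, and that remaining ties go to the highest address; these ordering rules are recalled from RFC 2362 rather than stated in the passage above, and the function names are hypothetical.

```python
import ipaddress
from typing import List, Tuple

RPT_BIT = 1 << 31  # assumed position: most significant bit of a 32-bit preference

# (address, metric_preference, metric, on_rp_tree)
Candidate = Tuple[str, int, int, bool]


def assert_key(candidate: Candidate) -> Tuple[int, int, int]:
    """Build a sort key; the smallest key wins the assert."""
    address, preference, metric, on_rp_tree = candidate
    if on_rp_tree:
        preference |= RPT_BIT      # RP-tree paths always compare as worse
    else:
        preference &= ~RPT_BIT     # SP-tree paths keep the bit cleared
    # Negate the address so that the highest address wins a full tie.
    return (preference, metric, -int(ipaddress.IPv4Address(address)))


def assert_winner(candidates: List[Candidate]) -> str:
    return min(candidates, key=assert_key)[0]


mixed = [
    ("10.0.0.1", 90, 5, True),     # better unicast route, but on the RP-tree
    ("10.0.0.2", 120, 20, False),  # worse unicast route, on the SP-tree
]
tied = [
    ("10.0.0.1", 110, 10, False),
    ("10.0.0.2", 110, 10, False),
]
```

In the first example the SP-tree router wins even though its unicast metric preference is numerically worse, which is how the duplicate stream over the LAN is eliminated.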
In case the packet or the Assert message matches on oif for the (*,*,RP) entry, a (*,G)
entry is created, and asserts take place as if the matching state were (*,G).
The DR may lose the (*,G) assert process to another router on the LAN if there are
multiple paths to the RP through the LAN. From then on, the DR is no longer
the last-hop router for local receivers and removes the LAN from its (*,G) oif list.
The winning router becomes the last-hop router and is responsible for sending (*,G)
Join messages to the RP.
incoming interface, if the link is operational, to inform upstream routers that this part of
the distribution tree is going away.
5.2.3.1 Hello. Hello messages are sent so neighboring routers can discover each
other.
SENDING HELLOS. Hello messages are sent periodically between PIM neighbors,
every [Hello-Period] seconds. This informs routers what interfaces have PIM neighbors.
Hello messages are multicast using address 224.0.0.13 (ALL-PIM-ROUTERS group).
The packet includes a holdtime, set to [Hello-Holdtime], for neighbors to keep the
information valid. Hellos are sent on all types of communication links.
RECEIVING HELLOS. When a router receives a Hello message, it stores the network
layer address for that neighbor, sets its neighbor timer for the hello sender to the holdtime
included in the hello, and determines the DR for that interface. The highest addressed
system is elected DR. Each hello received causes the DR's address to be updated.
When a router that is the active DR receives a hello from a new neighbor (i.e., from
an address that is not yet in the DR’s neighbor table), the DR unicasts its most recent
RP-set information to the new neighbor.
TIMING OUT NEIGHBOR ENTRIES. A periodic process is run to time out PIM
neighbors that have not sent hellos. If the DR has gone down, a new DR is chosen
by scanning all neighbors on the interface and selecting the new DR to be the one with the
highest network layer address. If an interface has gone down, the router may optionally
time out all PIM neighbors associated with the interface.
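Hello processing, neighbor timeout, and DR election on one interface can be sketched together. PimInterface and its methods are hypothetical names, and the 105-second holdtime in the usage lines is an assumed value chosen only to make the example concrete.

```python
import ipaddress
from typing import Dict


class PimInterface:
    """Neighbor table and DR election for a single interface (illustrative)."""

    def __init__(self, own_address: str) -> None:
        self.own_address = ipaddress.IPv4Address(own_address)
        self.neighbors: Dict[ipaddress.IPv4Address, float] = {}  # address -> expiry

    def on_hello(self, sender: str, holdtime: float, now: float) -> None:
        # Store the neighbor and set its timer to the holdtime carried in the hello.
        self.neighbors[ipaddress.IPv4Address(sender)] = now + holdtime

    def expire_neighbors(self, now: float) -> None:
        # Periodic process: drop neighbors whose hellos have stopped arriving.
        self.neighbors = {a: t for a, t in self.neighbors.items() if t > now}

    def designated_router(self) -> ipaddress.IPv4Address:
        # The highest network layer address on the interface is the DR.
        return max([self.own_address, *self.neighbors])


iface = PimInterface("10.0.0.2")
iface.on_hello("10.0.0.9", holdtime=105.0, now=0.0)
iface.on_hello("10.0.0.5", holdtime=105.0, now=50.0)
dr_initial = iface.designated_router()
iface.expire_neighbors(now=110.0)      # 10.0.0.9 has stopped sending hellos
dr_after_timeout = iface.designated_router()
```

Recomputing the DR from the full neighbor set after each change keeps the election stateless, at the cost of a scan over the neighbors on that interface.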
5.2.3.2 Join/Prune. Join/Prune messages are sent to join or prune a branch off of
the multicast distribution tree. A single message contains both a join and prune list, either
one of which may be null. Each list contains a set of source addresses, indicating the
source-specific trees or shared tree that the router wants to join or prune.
messages are only sent if the RPF neighbor is a PIM neighbor. A periodic Join/Prune
message sent to a particular RPF neighbor is constructed as follows:
1. Each router determines the RP for a (*,G) entry by using the hash function
described. The RP address (with RPT- and WC-bits set) is included in the join list
of a periodic Join/Prune message under the following conditions:
(a) The Join/Prune message is being sent to the RPF neighbor toward the RP for
an active (*,G) or (*,*,RP) entry, and
(b) The outgoing interface list in the (*,G) or (*,*,RP) entry is non-NULL, or the
router is the DR on the same interface as the RPF neighbor.
2. A particular source address, S, is included in the join list with the RPT and
WC-bits cleared under the following conditions:
(a) The Join/Prune message is being sent to the RPF neighbor toward S,
(b) There exists an active (S,G) entry with the RPT-bit flag cleared, and
(c) The oif list in the (S,G) entry is not null.
3. A particular source address, S, is included in the prune list with the RPT and WC
bits cleared under the following conditions:
(a) The Join/Prune message is being sent to the RPF neighbor toward S,
(b) There exists an active (S,G) entry with the RPT-bit flag cleared, and
(c) The oif list in the (S,G) entry is null.
4. A particular source address, S, is included in the prune list with the RPT-bit set
and the WC-bit cleared under the following conditions:
(a) The Join/Prune message is being sent to the RPF neighbor toward the RP and
there exists an (S,G) entry with the RPT-bit flag set and null oif list, or
(b) The Join/Prune message is being sent to the RPF neighbor toward the RP,
there exists an (S,G) entry with the RPT-bit flag cleared and SPT-bit set, and
the incoming interface toward S is different than the incoming interface
toward the RP, or
(c) The Join/Prune message is being sent to the RPF neighbor toward the RP, and
there exists an (*,G) entry and (S,G) entry for a directly connected source.
5. The RP address (with RPT- and WC-bits set) is included in the prune list if
(a) The Join/Prune message is being sent to the RPF neighbor toward the RP and
there exists a (*,G) entry with a null oif list.
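The construction rules above can be approximated for a single group as shown below. This is a deliberately reduced sketch: it ignores (*,*,RP) entries, the DR clause of rule 1(b), and rule 4(c), and all type and function names (StarG, SG, build_periodic_join_prune) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

Addr = Tuple[str, bool, bool]  # (address, rpt_bit, wc_bit)


@dataclass
class StarG:
    rp: str
    rpf_neighbor: str                 # RPF neighbor toward the RP
    iif: str
    oifs: Set[str] = field(default_factory=set)


@dataclass
class SG:
    source: str
    rpf_neighbor: str                 # RPF neighbor toward S
    iif: str
    oifs: Set[str] = field(default_factory=set)
    rpt_flag: bool = False
    spt_bit: bool = False


def build_periodic_join_prune(neighbor: str, star_g: Optional[StarG],
                              sg_entries: List[SG]) -> Tuple[List[Addr], List[Addr]]:
    """Return (join_list, prune_list) for one group and one RPF neighbor."""
    joins: List[Addr] = []
    prunes: List[Addr] = []
    toward_rp = star_g is not None and star_g.rpf_neighbor == neighbor
    if toward_rp:
        if star_g.oifs:
            joins.append((star_g.rp, True, True))     # rule 1
        else:
            prunes.append((star_g.rp, True, True))    # rule 5
    for e in sg_entries:
        if not e.rpt_flag and e.rpf_neighbor == neighbor:
            target = joins if e.oifs else prunes      # rules 2 and 3
            target.append((e.source, False, False))
        if toward_rp:
            pruned_on_shared = e.rpt_flag and not e.oifs                      # rule 4(a)
            switched = not e.rpt_flag and e.spt_bit and e.iif != star_g.iif   # rule 4(b)
            if pruned_on_shared or switched:
                prunes.append((e.source, True, False))
    return joins, prunes


star_g = StarG(rp="10.255.0.1", rpf_neighbor="10.0.0.1", iif="eth0", oifs={"eth2"})
sg = SG(source="192.0.2.10", rpf_neighbor="10.0.1.1", iif="eth1",
        oifs={"eth2"}, spt_bit=True)
to_rp = build_periodic_join_prune("10.0.0.1", star_g, [sg])
to_source = build_periodic_join_prune("10.0.1.1", star_g, [sg])
```

In the usage lines the router has switched to the SP-tree for one source, so the message toward the RP joins the shared tree while pruning that source from it, and the message toward the source carries only the source-specific join.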
that there are no longer directly connected members, the oif is removed from the
oif list if the oif timer is not running. A Join/Prune message is triggered if and
only if (a) a new entry is created or (b) the oif list changes from null to nonnull or
nonnull to null, as follows:
(a) If the receiving router does not have a route entry for G, the router creates a
(*,G) entry, copies the oif list from the corresponding (*,*,RP) entry (if it
exists), and includes the interface included in the IGMP membership
indication in the oif list; as always, the router never includes the entry's
iif in the oif list. The router sends a Join/Prune message toward the RP with
the RP address and RPT- and WC-bits set in the join list. Or,
(b) If an (S,G)RPT-bit or (*,G) entry already exists, the interface included in the
IGMP membership indication is added to the oif list (if it was not included
already).
2. Receipt of a Join/Prune message for (S,G), (*,G), or (*,*,RP) will cause building
or modifying the corresponding state, and subsequent triggering of upstream
Join/Prune messages, in the following cases:
(a) When there is no current route entry, the RP address included in the Join/
Prune message is checked against the local RP-set information. If it matches,
an entry will be created and the new entry will in turn trigger an upstream
Join/Prune message. If the router has no RP-set information, it may discard
the message or optionally use the RP address included in the message.
(b) When the outgoing interface list of an (S,G) RPT-bit entry becomes null, the
triggered Join/Prune message will contain S in the prune list.
(c) When there exists an (S,G)RPT-bit with null oif list and a (*,G) Join/Prune
message is received, the arriving interface is added to the oif list and a (*,G)
Join/Prune message is triggered upstream.
(d) When there exists a (*,G) with null oif list and a (*,*,RP) Join/Prune message
is received, the receiving interface is added to the oif list and a (*,*,RP) Join/
Prune message is triggered upstream.
3. Receipt of a packet that matches an (S,G) entry whose SPT-bit is cleared triggers
the following if the packet arrived on the correct incoming interface and there is a
(*,G) or (*,*,RP) entry with a different incoming interface: (a) the router sets the
SPT-bit on the (S,G) entry and (b) the router sends a Join/Prune message toward
the RP with S in the prune list and the RPT-bit set.
4. Receipt of a packet at the DR from a directly connected source S, on the subnet
containing the address S, triggers a Join/Prune message toward the RP with S in
the prune list and the RPT-bit set under the following conditions: (a) there is no
matching (S,G) state and (b) there exists a (*,G) or (*,*,RP) for which the DR is
not the RP.
5. When a Join/Prune message is received for a group G, the prune list is checked. If
the prune list contains a source or RP for which the receiving router has a
corresponding active (S,G), (*,G) or (*,*,RP) entry, and whose iif is that on which
the join/prune was received, then a join for (S,G), (*,G), or (*,*,RP) is triggered to
One does not trigger prunes onto interfaces based on data packets. Data packets that
arrive on the wrong incoming interface are silently dropped. However, on point-to-point
interfaces, triggered prunes may be sent as an optimization.
It is possible that a Join/Prune message constructed according to the preceding rules
could exceed the MTU of a network. In this case, the message can undergo semantic
fragmentation, whereby information corresponding to different groups can be sent in
different messages. However, if a Join/Prune message must be fragmented, the complete
prune list corresponding to a group G must be included in the same Join/Prune message as
the associated RP-tree join for G. If such semantic fragmentation is not possible, IP
fragmentation should be used between the two neighboring hops.
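Semantic fragmentation amounts to packing whole per-group blocks into messages so that no group is split. The sketch below assumes each group's join and prune lists have already been encoded and sized; the function name and the byte sizes are hypothetical.

```python
from typing import List, Tuple


def fragment_by_group(group_blocks: List[Tuple[str, int]],
                      max_payload: int) -> List[List[str]]:
    """Pack (group, encoded_size) blocks into messages without splitting a group.

    A block larger than max_payload ends up alone in its own message; for that
    case the text prescribes IP fragmentation between the two neighbors.
    """
    messages: List[List[str]] = []
    current: List[str] = []
    used = 0
    for group, size in group_blocks:
        if current and used + size > max_payload:
            messages.append(current)    # close the message before it overflows
            current, used = [], 0
        current.append(group)
        used += size
    if current:
        messages.append(current)
    return messages


blocks = [("239.1.1.1", 600), ("239.1.1.2", 500), ("239.1.1.3", 300)]
fragments = fragment_by_group(blocks, max_payload=1000)
```

Keeping each group's block intact guarantees that the prune list for G always travels with the associated RP-tree join for G.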
RECEIVING JOIN/PRUNE MESSAGES. When a router receives a Join/Prune message, it
processes it as follows:
The receiver of the join/prune notes the interface on which the PIM message arrived,
call it I. The receiver then checks to see if the Join/Prune message was addressed
to the receiving router itself (i.e., the router's address appears in the Unicast
Upstream Neighbor Router field of the Join/Prune message). (If the router is
connected to a multiaccess LAN, the message could be intended for a different
router.) If the join/prune is for this router, the following actions are taken.
For each group address, G, in the Join/Prune message, the associated join list is
processed as follows. One refers to each address in the join list as Sj; Sj refers to
the RP if the RPT-bit and WC-bit are both set. For each Sj in the join list of the
Join/Prune message:
1. If an address, Sj, in the join list of the Join/Prune message has the RPT-bit and
WC-bit set, then Sj is the RP address used by the downstream router(s) and the
following actions are taken:
(a) If Sj is not the same as the receiving router's RP mapping for G, the
receiving router may ignore the Join/Prune message with respect to that
group entry. If the router does not have any RP-set information, it may
use the address Sj included in the Join/Prune message as the RP for the
group.
(b) If Sj is the same as the receiving router's RP mapping for G, the receiving
router adds I to the outgoing interface list of the (*,G) route entry [if there is
no (*,G) entry, the router creates one first] and sets the Oif-timer for that
interface to the holdtime specified in the Join/Prune message. In addition, the
oif-deletion delay for that interface is set to one-third the holdtime specified
in the Join/Prune message. If a (*,*,RP) entry exists, for the RP associated
with G, then the oif list of the newly created (*,G) entry is copied from that
(*,*,RP) entry.
(c) For each (Si,G) entry associated with group G: (i) if Si is not included in the
prune list, (ii) if I is not on the same subnet as the address Si, and (iii) if I is not
the iif, then interface I is added to the oif list and the oif timer for that interface
in each affected entry is increased (never decreased) to the Holdtime
included in the Join/Prune message. In addition, if the oif timer for that
interface is increased, the oif deletion delay for that interface is set to
one-third the holdtime specified in the Join/Prune message.
If the group address in the Join/Prune message is “*,” then every (*,G) and
(S,G) entry, whose group address hashes to the RP indicated in the (*,*,RP)
join/prune message, is updated accordingly. A “*” in the group field of the
join/prune is represented by a group address 224.0.0.0 and a group mask
length of 4, indicating a (*,*,RP) join.
(d) If the (Si,G) entry has its RPT-bit flag set to 1, and its oif list is the same as the
(*,G) oif list, then the (Si,G)RPT-bit entry is deleted.
(e) The incoming interface is set to the interface used to send unicast packets to
the RP in the (*,G) route entry, that is, RPF interface toward the RP.
2. For each address, Sj, in the join list whose RPT-bit and WC-bit are not set, and
for which there is no existing (Sj,G) route entry, the router initiates one. The
router creates an (S,G) entry and copies all outgoing interfaces from the (S,G)
RPT-bit entry, if it exists. If there is no (S,G) entry, the oif list is copied from the
(*,G) entry; and if there is no (*,G) entry, the oif list is copied from the (*,*,RP)
entry, if it exists. In all cases, the iif of the (S,G) entry is always excluded from
the oif list.
(a) The outgoing interface for (Sj,G) is set to I. The incoming interface for (Sj,G)
is set to the interface used to send unicast packets to Sj (i.e., the RPF neighbor).
(b) If the interface used to reach Sj is the same as I, this represents an error (or a
unicast routing change) and the join/prune must not be processed.
3. For each address, Sj, in the join list of the Join/Prune message, for which there is
an existing (Sj,G) route entry:
(a) If the RPT-bit is not set for Sj listed in the Join/Prune message, but the
RPT-bit flag is set on the existing (Sj,G) entry, the router clears the RPT-bit
flag on the (Sj,G) entry, sets the incoming interface to point toward Sj for that
For each group address G, in the Join/Prune message, the associated prune list is
processed as follows. One refers to each address in the prune list as Sp; Sp refers to the RP
if the RPT-bit and WC-bit are both set. For each Sp in the prune list of the Join/Prune
message:
1. For each address, Sp, in the prune list whose RPT-bit and WC-bit are cleared:
(a) If there is an existing (Sp,G) route entry, the router lowers the entry's oif timer
for I to its oif-deletion delay, allowing for other downstream routers on a
multiaccess LAN to override the prune. However, on point-to-point links, the
oif timer is expired immediately.
(b) If the router has a current (*,G), or (*,*,RP), route entry, and if the existing
(Sp,G) entry has its RPT-bit flag set to 1, then this (Sp,G)RPT-bit entry is
maintained (not deleted) even if its outgoing interface list is null.
2. For each address, Sp, in the prune list whose RPT-bit is set and whose WC-bit is
cleared:
(a) If there is an existing (Sp,G) route entry, the router lowers the entry's oif timer
for I to its oif-deletion delay, allowing for other downstream routers on a
multiaccess LAN to override the prune. However, on point-to-point links, the
oif timer is expired immediately.
(b) If the router has a current (*,G), or (*,*,RP), route entry, and if the existing (Sp,G)
entry has its RPT-bit flag set to 1, then this (Sp,G)RPT-bit entry is not deleted, and
the entry timer is restarted, even if its outgoing interface list is null.
(c) If (*,G), or corresponding (*,*,RP), state exists, but there is no (Sp,G) entry,
an (Sp,G)RPT-bit entry is created. The outgoing interface list is copied from
the (*,G), or (*,*,RP), entry, with the interface I on which the prune was
received deleted. Packets from the pruned source Sp match on this state and
are not forwarded toward the pruned receivers.
(d) If there exists an (Sp,G) entry, with or without the RPT-bit set, the oif timer
for I is expired, and the entry timer is restarted.
3. For each address, Sp, in the prune list whose RPT-bit and WC-bit are both set:
(a) If there is an existing (*,G) entry, with Sp as the RP for G, the router lowers
the entry's oif timer for I to its oif-deletion delay, allowing for other
For any new (S,G), (*,G), or (*,*,RP) entry created by an incoming Join/Prune
message, the SPT-bit is cleared (and if a join/prune-suppression timer is used, it is left off).
If the entry has a join/prune-suppression timer associated with it, and if the
received join/prune does not indicate the router as its target, then the receiving router
examines the join and prune lists to see if any addresses in the list “completely match”
the existing (S,G), (*,G), or (*,*,RP) state for which the receiving router currently
schedules Join/Prune messages. An element on the join or prune list “completely
matches” a route entry only if both the addresses and RPT-bit flag are the same. If the
incoming Join/Prune message completely matches an existing (S,G), (*,G), or (*,*,RP)
entry and the join/prune arrived on the iif for that entry, then the router compares the
holdtime included in the Join/Prune message, to its own [Join/Prune-Holdtime]. If its
own [Join/Prune-Holdtime] is lower, the join/prune-suppression timer is started at the
[Join/Prune-Suppression-Timeout]. If the [Join/Prune-Holdtime] is equal, the tie is
resolved in favor of the Join/Prune Message originator that has the higher network layer
address. When the join/prune timer expires, the router triggers a Join/Prune message
for the corresponding entry(ies).
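The holdtime comparison that decides whether a router suppresses its own periodic Join/Prune can be written as a small predicate. The sketch assumes the overheard message has already been found to match an existing entry completely and to have arrived on that entry's iif; the function name and parameter names are hypothetical.

```python
import ipaddress


def should_suppress(own_holdtime: float, received_holdtime: float,
                    own_address: str, sender_address: str) -> bool:
    """Decide whether to start the join/prune-suppression timer."""
    if own_holdtime < received_holdtime:
        return True     # the other router's state outlasts ours, so we can stay quiet
    if own_holdtime == received_holdtime:
        # Tie: the originator with the higher address keeps sending.
        return ipaddress.IPv4Address(sender_address) > ipaddress.IPv4Address(own_address)
    return False


lower = should_suppress(60, 90, "10.0.0.2", "10.0.0.1")
higher = should_suppress(90, 60, "10.0.0.2", "10.0.0.9")
tie_sender_wins = should_suppress(60, 60, "10.0.0.2", "10.0.0.9")
tie_self_wins = should_suppress(60, 60, "10.0.0.2", "10.0.0.1")
```

When the predicate returns True, the router would start its suppression timer at [Join/Prune-Suppression-Timeout] and resume triggering Join/Prune messages once that timer expires.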
1. Decapsulates the data packet and checks for a corresponding (S,G) entry.
(a) If an (S,G) entry with a cleared (0) SPT-bit exists, and the received register
does not have the null-register bit set to 1, the packet is forwarded, and the
SPT-bit is left cleared (0). If the SPT-bit is 1, the packet is dropped, and
Register-Stop messages are triggered. Register-stops should be rate limited
(in an implementation-specific manner) so that not more than a few are sent
per round-trip time. This prevents a high-data-rate stream of packets from
triggering a large number of Register-Stop messages between the time that
the first packet is received and the time when the source receives the first
register-stop.
(b) If there is no (S,G) entry, but there is a (*,G) entry, and the received register
does not have the null-register bit set to 1, the packet is forwarded according
to the (*,G) entry.
(c) If there is an (*,*,RP) entry but no (*,G) entry, and the register received does
not have the null-register bit set to 1, a (*,G) or (S,G) entry is created and the
oif list is copied from the (*,*,RP) entry to the new entry. The packet is
forwarded according to the created entry.
(d) If there is no G or (*,*,RP) entry corresponding to G, the packet is dropped,
and a register-stop is triggered.
(e) A “border bit” is added to the Register message to facilitate interoperability
mechanisms. PMBRs set this bit when registering for external sources. If the
“border bit” is set in the register, the RP does the following:
i. If there is no matching (S,G) state, but there exists a (*,G) or (*,*,RP) entry,
the RP creates an (S,G) entry, with a “PMBR” field. This field holds the
source of the register (i.e., the outer network layer address of the register
packet). The RP triggers an (S,G) join towards the source of the data
packet and clears the SPT-bit for the (S,G) entry. If the received register
is not a “null register,” the packet is forwarded according to the created
state. Else:
ii. If the “PMBR” field for the corresponding (S,G) entry matches the
source of the register packet and the received register is not a “null
register,” the de-encapsulated packet is forwarded to the oif list of that
entry. Else:
iii. If the “PMBR” field for the corresponding (S,G) entry matches the source
of the register packet, the de-encapsulated packet is forwarded to the oif
list of that entry. Else:
iv. The packet is dropped, and a register-stop is triggered toward the source
of the register.
The (S,G) entry timer is restarted by registers arriving from that source to that group.
2. If the matching (S,G) or (*,G) state contains a null oif list, the RP unicasts a
Register-Stop message to the source of the Register message; in the latter case,
the Source Address field, within the Register-Stop message, is set to the wildcard
value (all 0s). This message is not processed by intermediate routers; hence, no
(S,G) state is constructed between the RP and the source.
3. If the Register message arrival rate warrants it and there is no existing (S,G) entry,
the RP sets up an (S,G) route entry with the outgoing interface list, excluding iif
(S,G), copied from the (*,G) outgoing interface list, and its SPT-bit is initialized
to 0. If a (*,G) entry does not exist, but there exists a (*,*,RP) entry with the RP
corresponding to G, the oif list for (S,G) is copied—excluding the iif—from that
(*,*,RP) entry.
A timer [Entry-Timer] is set for the (S,G) entry and this timer is restarted by receipt
of data packets for (S,G). The (S,G) entry causes the RP to send a Join/Prune message for
the indicated group toward the source of the Register message.
If the (S,G) oif list becomes null, Join/Prune messages will not be sent toward the
source S.
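Steps 1(a) through 1(d) of the register processing above can be condensed into a small decision function. The sketch below is a simplification, not the RFC's procedure: it ignores the border bit, rate limiting, and null oif lists. The function name, the action strings, and the choice to report "none" for a null register that matches existing state are all assumptions made for illustration.

```python
from typing import Optional, Tuple


def rp_register_action(sg_spt_bit: Optional[bool], has_star_g: bool,
                       has_star_star_rp: bool,
                       null_register: bool) -> Tuple[str, bool]:
    """Return (action, send_register_stop) for a register received at the RP.

    sg_spt_bit is None when no (S,G) entry exists; otherwise it is the
    entry's SPT-bit.
    """
    if sg_spt_bit is not None:
        if sg_spt_bit:
            # 1(a): SPT-bit set, so data already arrives natively.
            return ("drop", True)
        return ("none" if null_register else "forward-on-(S,G)", False)
    if has_star_g:
        # 1(b): no (S,G) entry, but a (*,G) entry exists.
        return ("none" if null_register else "forward-on-(*,G)", False)
    if has_star_star_rp:
        # 1(c): only (*,*,RP) state; a new entry inherits its oif list.
        return ("none" if null_register else "create-entry-and-forward", False)
    # 1(d): no state for G at all.
    return ("drop", True)
```

For example, `rp_register_action(None, False, False, False)` yields `("drop", True)`: with no state for the group, the RP discards the packet and triggers a register-stop.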
1. Look up the route state based on the longest match of the source address and an
exact match of the destination address in the data packet. If neither S nor G finds
a longest-match entry, and the RP for the packet's destination group address has
a corresponding (*,*,RP) entry, then the longest match does not require an exact
match on the destination group address. In summary, the longest match is
performed in the following order: (1) (S,G), (2) (*,G). If neither is matched,
then a lookup is performed on (*,*,RP) entries.
PIM SM DETAILS 103
2. If the packet arrived on the interface found in the matching entry's iif field and the
oif list is not null:
(a) Forward the packet to the oif list for that entry, excluding the subnet
containing S, and restart the entry timer if the matching entry is (S,G).
Optionally, the (S,G) entry timer may be restarted by periodic checking of the
matching packet count.
(b) If the entry is an (S,G) entry with a cleared SPT-bit, and a (*,G) or associated
(*,*,RP) also exists whose incoming interface is different than that for (S,G), set
the SPT-bit for the (S,G) entry and trigger an (S,G)RPT-bit prune toward the RP.
(c) If the source of the packet is a directly connected host and the router is the DR
on the receiving interface, check the register-suppression timer associated
with the (S,G) entry. If it is not running, then the router encapsulates the data
packet in a Register message and sends it to the RP.
This covers the common case of a packet arriving on the RPF interface to
the source or RP and being forwarded to all joined branches. It also detects
when packets arrive on the SP-tree and triggers their pruning from the
RP-tree. If the router is the DR for the source, it sends data packets encapsulated in
registers to the RPs.
3. If the packet matches to an entry but did not arrive on the interface found in the
entry's iif field, check the SPT-bit of the entry. If the SPT-bit is set, drop the
packet. If the SPT-bit is cleared, then look up the (*,G), or (*,*,RP), entry for G. If
the packet arrived on the iif found in (*,G), or the corresponding (*,*,RP),
forward the packet to the oif list of the matching entry. This covers the case when
a data packet matches an (S,G) entry for which the SP-tree has not yet been
completely established upstream.
4. If the packet does not match any entry but the source of the data packet is a local,
directly connected host and the router is the DR on a multiaccess LAN and
has RP-set information, the DR uses the hash function to determine the RP
associated with the destination group G. The DR creates an (S,G) entry, with the
register-suppression timer not running, encapsulates the data packet in a Register
message, and unicasts it to the RP.
5. If the packet does not match any entry and it is not a local host or the router is not
the DR, drop the packet.
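The lookup order in step 1 and the incoming-interface check in step 2 can be sketched as follows. This is a minimal illustration that assumes a dictionary keyed by tuples and host-exact source matching. The names `lookup_route` and `forward_oifs` and the table layout are hypothetical, and the fallback in step 3 and the register path in step 4 are left out.

```python
from typing import Dict, List, Optional, Tuple


def lookup_route(table: Dict[tuple, dict], source: str, group: str,
                 rp: str) -> Optional[Tuple[str, ...]]:
    """Try (S,G) first, then (*,G), then the (*,*,RP) entry for the group's RP."""
    for key in ((source, group), ("*", group), ("*", "*", rp)):
        if key in table:
            return key
    return None


def forward_oifs(table: Dict[tuple, dict], source: str, group: str,
                 rp: str, arrival_if: str) -> List[str]:
    """Return the interfaces to forward on, or [] if the packet is not forwarded."""
    key = lookup_route(table, source, group, rp)
    if key is None:
        return []
    entry = table[key]
    # Step 2: forward only if the packet arrived on the entry's iif
    # and the oif list is not null.
    if entry["iif"] != arrival_if or not entry["oifs"]:
        return []
    return sorted(oif for oif in entry["oifs"] if oif != arrival_if)
```

Because the loop stops at the first hit, an (S,G) entry always takes precedence over the shared-tree state for the same group.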
threshold (t2), an (S,G) entry is created and a Join/Prune message is sent toward the
source. If the RPF interface for (S,G) is not the same as that for (*,G) or (*,*,RP), then the
SPT-bit is cleared in the (S,G) entry.
Other configured rules may be enforced to cause or prevent establishment of the
(S,G) state.
5.2.3.5 Assert. Asserts are used to resolve which of the parallel routers connected
to a multiaccess LAN is responsible for forwarding packets onto the LAN.
SENDING ASSERTS. The following assert rules are provided when a multicast packet
is received on an outgoing multiaccess interface “I” of an existing active (S,G), (*,G), or
(*,*,RP) entry.
1. Do a unicast routing table lookup on the source address from the data packet, and send
an assert on interface “I” for the source address in the data packet; include the metric
preference of the routing protocol and the metric from the routing table lookup.
2. If a route is not found, use a metric preference of 0x7fffffff and a metric of 0xffffffff.
When an assert is sent for a (*,G) entry, the first bit in the metric preference
(the RPT-bit) is set to 1, indicating the data packet is routed down the RP-tree.
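A short sketch of how the two metric fields of an outgoing assert might be filled in is given below. The function name `assert_metric_fields` and the `(preference, metric)` tuple for the routing lookup result are assumptions for illustration; the constants and the placement of the RPT-bit as the first (most significant) bit of the preference word come from the rules above.

```python
from typing import Optional, Tuple

NO_ROUTE_PREFERENCE = 0x7FFFFFFF   # used when the unicast lookup fails
NO_ROUTE_METRIC = 0xFFFFFFFF


def assert_metric_fields(route: Optional[Tuple[int, int]],
                         rpt_bit: bool) -> Tuple[int, int]:
    """Return (preference word with RPT-bit, metric) for an Assert message.

    route is a hypothetical (metric_preference, metric) pair taken from the
    unicast routing table, or None if no route to the source was found.
    """
    if route is None:
        preference, metric = NO_ROUTE_PREFERENCE, NO_ROUTE_METRIC
    else:
        preference, metric = route
    # The RPT-bit occupies the top bit; the preference uses the lower 31 bits.
    word = (int(rpt_bit) << 31) | (preference & 0x7FFFFFFF)
    return word, metric & 0xFFFFFFFF
```

Setting the RPT-bit makes the 32-bit word numerically larger, which is what lets a receiver treat the bit as the high-order part of the comparison.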
RECEIVING ASSERTS. When an assert is received, the router performs the longest
match on the source and group address in the Assert message; only active entries (those
that have packet forwarding state) are matched. The router checks the first bit of the metric
preference (RPT-bit).
1. If the RPT-bit is set, the router first does a match on (*,G), or (*,*,RP), entries; if
no matching entry is found, it ignores the assert.
2. If the RPT-bit is not set in the assert, the router first does a match on (S,G) entries;
if no matching entry is found, the router matches (*,G) or (*,*,RP) entries.
Receiving Asserts on an Entry's Outgoing Interface. If the interface that received the
Assert message is in the oif list of the matched entry, then this assert is processed by this
router as follows:
1. If the assert's RPT-bit is set and the matching entry is (*,*,RP), the router creates a
(*,G) entry. If the assert's RPT-bit is cleared and the matching entry is (*,G), or
(*,*,RP), the router creates an (S,G)RPT-bit entry. Otherwise, no new entry is
created in response to the assert.
2. The router then compares the metric values received in the assert with the metric
values associated with the matched entry. The RPT-bit and metric preference (in
that order) are treated as the high-order part of an assert metric comparison. If the
value in the assert is less than the router's value (with ties broken by the IP
address, where the higher network layer address wins), delete the interface from
the entry. When the deletion occurs for a (*,G) or (*,*,RP) entry, the interface is
also deleted from any associated (S,G)RPT-bit or (*,G) entries, respectively. The
entry timer for the affected entries is restarted.
3. If the router has won the election, the router keeps the interface in its outgoing
interface list. It acts as the forwarder for the LAN.
The winning router sends an Assert message containing its own metric to that
outgoing interface. This will cause other routers on the LAN to prune that interface from
their route entries. The winning router sets the RPT-bit in the Assert message if a (*,G) or
(S,G)RPT-bit entry was matched.
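The assert comparison can be expressed as an ordered tuple comparison. The sketch below is an illustration only: the `AssertMetric` type and `assert_winner` function are hypothetical names, and the ordering (RPT-bit, then metric preference, then metric, lower is better, ties to the higher address) follows the description above.

```python
import ipaddress
from typing import NamedTuple


class AssertMetric(NamedTuple):
    """Hypothetical container for the values compared during an assert."""
    rpt_bit: bool
    preference: int
    metric: int
    address: str     # network layer address of the router


def assert_winner(a: AssertMetric, b: AssertMetric) -> AssertMetric:
    """Return the router that stays the forwarder on the LAN."""
    def sort_key(m: AssertMetric):
        # Lower (rpt_bit, preference, metric) is better; negating the
        # address makes the HIGHER address win when everything else ties.
        return (int(m.rpt_bit), m.preference, m.metric,
                -int(ipaddress.ip_address(m.address)))
    return min(a, b, key=sort_key)
```

Because the RPT-bit is compared first, a router forwarding on the shortest-path tree beats one forwarding on the RP-tree regardless of their unicast metrics.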
Receiving Asserts on an Entry's Incoming Interface. If the assert arrived on the incoming
interface of an existing (S,G), (*,G), or (*,*,RP) entry, the assert is processed as follows.
If the Assert message does not match the entry exactly, it is ignored; that is, the longest
match is not used in this case. If the Assert message does match exactly, then:
1. Downstream routers will select the upstream router with the smallest metric
preference and metric as their RPF neighbor. If two metrics are the same, the
highest network layer address is chosen to break the tie. This is important so that
downstream routers send subsequent joins/prunes (in SM) to the correct neigh-
bor. An assert timer is initiated when changing the RPF neighbor to the assert
winner. When the timer expires, the router resets its RPF neighbor according to
its unicast routing tables to capture network dynamics and router failures.
2. If the downstream routers have downstream members, and if the assert caused the
RPF neighbor to change, the downstream routers must trigger a Join/Prune
message to inform the upstream router that packets are to be forwarded on the
multiaccess network.
SENDING C-RP-ADVS. C-RPs periodically unicast C-RP-Advs to the BSR for that
domain. The interval for sending these messages is subject to local configuration at the
C-RP.
C-RP-Advs carry Group Address and Group Mask fields. This enables the
advertising router to limit the advertisement to certain prefixes or scopes of groups.
The advertising router may enforce this scope acceptance when receiving registers or
Join/Prune messages. C-RPs should send C-RP-Adv messages with the Priority field
set to 0.
1. If the router is not the elected BSR, it ignores the message. Else:
2. The BSR adds the RP address to its local pool of candidate RPs, according to
the associated group prefix(es) in the C-RP-Adv message. The holdtime in
the C-RP-Adv message is also stored with the corresponding RP, to be
included later in the Bootstrap message. The BSR may apply a local policy to
limit the number of C-RPs included in the Bootstrap message. The BSR may
override the prefix indicated in a C-RP-Adv unless the Priority field is not
zero.
The BSR keeps an RP timer per RP in its local RP-set. The RP timer is initialized to
the holdtime in the RP's C-RP-Adv. When the timer expires, the corresponding RP is
removed from the RP set. The RP timer is restarted by the C-RP-Advs from the
corresponding RP.
The BSR also uses its bootstrap timer to periodically send Bootstrap messages. In
particular, when the bootstrap timer expires, the BSR originates a Bootstrap message
on each of its PIM interfaces. To reduce the Bootstrap message overhead during
partition healing, the BSR should set a random time (as a function of the priority and
address) after which the Bootstrap message is originated only if no other preferred
Bootstrap message is received. The message is sent with a TTL of 1 to the “ALL-PIM-
ROUTERS” group. In steady state, the BSR originates Bootstrap messages periodi-
cally. At startup, the bootstrap timer is initialized to [Bootstrap-Timeout], causing
the first Bootstrap message to be originated only when and if the timer expires.
For timer details, see the next section, “Receiving and Forwarding Bootstrap.” A
DR unicasts a Bootstrap message to each new PIM neighbor, that is, after the DR
receives the neighbor's Hello message (it does so even if the new neighbor becomes
the DR).
The Bootstrap message is subdivided into sets of Group-Prefix, RP-Count, and
RP-addresses. For each RP address, the corresponding holdtime is included in the
RP-Holdtime field. The format of the Bootstrap message allows “semantic fragmentation” if
the length of the original Bootstrap message exceeds the packet maximum boundaries
(see Section 5.2.4). However, the RFC recommends against configuring a large number
of routers as C-RPs to reduce the semantic fragmentation required.
1. If the message was not sent by the RPF neighbor toward the BSR address
included, the message is dropped. Else:
2. If the included BSR is not preferred over and not equal to the currently active BSR:
(a) If the bootstrap timer has not yet expired or if the receiving router is a C-BSR,
then the Bootstrap message is dropped. Else:
(b) If the bootstrap timer has expired and the receiving router is not a C-BSR, the
receiving router stores the RP-set and BSR address and priority found in the
message and restarts the timer by setting it to [Bootstrap-Timeout]. The
Bootstrap message is then forwarded out of all PIM interfaces, excluding the
one over which the message arrived, to the “ALL-PIM-ROUTERS” group,
with a TTL of 1.
3. If the Bootstrap message includes a BSR address that is preferred over or
equal to the currently active BSR, the router restarts its bootstrap timer at
[Bootstrap-Timeout] seconds and stores the BSR address and RP-set informa-
tion.
The Bootstrap message is then forwarded out of all PIM interfaces, excluding
the one over which the message arrived, to the “ALL-PIM-ROUTERS” group,
with a TTL of 1.
4. If the receiving router has no current RP-set information and the bootstrap was
unicast to it from a directly connected neighbor, the router stores the information
as its new RP-set. This covers the startup condition when a newly booted router
obtains the RP-set and BSR address from its DR.
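Steps 1 through 3 of the Bootstrap processing reduce to a small decision table, sketched below. The function name and the boolean inputs are hypothetical; in particular, how one BSR is judged "preferred" over another is not spelled out in this section, so the sketch takes that result as an input rather than computing it. The unicast startup case in step 4 is not modeled.

```python
def handle_bootstrap(from_rpf_neighbor: bool, preferred_or_equal: bool,
                     bootstrap_timer_expired: bool, is_c_bsr: bool) -> str:
    """Return "drop" or "store-and-forward" for a received Bootstrap message.

    "store-and-forward" stands for: store the BSR address and RP-set,
    restart the bootstrap timer at [Bootstrap-Timeout], and forward the
    message on all other PIM interfaces with a TTL of 1.
    """
    if not from_rpf_neighbor:
        return "drop"                          # step 1
    if not preferred_or_equal:
        if not bootstrap_timer_expired or is_c_bsr:
            return "drop"                      # step 2(a)
        return "store-and-forward"             # step 2(b)
    return "store-and-forward"                 # step 3
```

The effect is that a less-preferred BSR is accepted only by routers that are not C-BSRs and that have already timed out the current BSR.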
When a router receives a new RP-set, it checks if each of the RPs referred to by the
existing state [i.e., by (*,G), (*,*,RP), or (S,G)RPT-bit entries] is in the new RP-set. If an
RP is not in the new RP-set, that RP is considered unreachable and the hash algorithm
(see below) is reperformed for each group with locally active state that previously hashed
to that RP. This will cause those groups to be distributed among the remaining RPs. When
the new RP-set contains a new RP, the value of the new RP is calculated for each group
covered by that C-RP's group prefix. Any group for which the new RP's value is greater
than the previously active RP's value is switched over to the new RP.
5.2.3.7 Hash Function. The hash function is used by all routers within a
domain to map a group to one of the C-RPs from the RP-set. For a particular group
G, the hash function uses only those C-RPs whose group prefix covers G. The algorithm
takes as input the group address and the addresses of the candidate RPs and gives as
output one RP address to be used.
The protocol requires that all routers hash to the same RP within a domain (except
for transients). The following hash function must be used in each router:
1. For RP addresses in the RP-set whose group prefix covers G, select the RP with
the highest priority (i.e., the lowest “priority” value), and compute a value
Ties between RPs having the same hash value and priority are broken in favor of
the highest address.
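The value computation is cut off in the text above. The sketch below uses the formula that RFC 2362 gives for it, Value(G,M,C) = (1103515245 * ((1103515245 * (G & M) + 12345) XOR C) + 12345) mod 2^31, where M is the hash mask and C is the candidate RP address; it should be checked against the RFC before being relied upon. The function names, the IPv4-only handling, and the `(address, priority)` candidate list are assumptions for illustration, and every candidate passed in is assumed to have a group prefix that covers G.

```python
import ipaddress
from typing import List, Tuple


def rp_hash_value(group: int, hash_mask: int, rp_addr: int) -> int:
    """Hash value for one candidate RP (formula as given in RFC 2362)."""
    inner = (1103515245 * (group & hash_mask) + 12345) ^ rp_addr
    return (1103515245 * inner + 12345) % (2 ** 31)


def select_rp(group: str, hash_mask_len: int,
              candidates: List[Tuple[str, int]]) -> str:
    """Pick the RP for a group from (rp_address, priority) candidates."""
    g = int(ipaddress.IPv4Address(group))
    mask = (0xFFFFFFFF << (32 - hash_mask_len)) & 0xFFFFFFFF
    # Highest priority means the lowest numeric priority value.
    best_priority = min(priority for _, priority in candidates)
    finalists = [int(ipaddress.IPv4Address(addr))
                 for addr, priority in candidates if priority == best_priority]
    # Highest hash value wins; ties go to the highest address.
    winner = max(finalists, key=lambda c: (rp_hash_value(g, mask, c), c))
    return str(ipaddress.IPv4Address(winner))
```

Because only the masked part of the group address enters the hash, consecutive groups that share the masked bits map to the same RP, and every router that holds the same RP-set computes the same answer.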
The hash function algorithm is invoked by a DR upon reception of a packet or IGMP
membership indication for a group for which the DR has no entry. It is invoked by any
router that has a (*,*,RP) state when a packet is received for which there is no
corresponding (S,G) or (*,G) entry. Furthermore, the hash function is invoked by all
routers upon receiving a (*,G) or (*,*,RP) Join/Prune message.
5.2.3.8 Processing Timer Events. This section enumerates all timers that have
been discussed or implied. Since some critical timer events are not associated with the
receipt or sending of messages, they are not fully covered by earlier sections.
Timers are implemented in an implementation-specific manner. For example, a
timer may count up or down or may simply expire at a specific time. Setting a timer to a
value T means that it will expire after T seconds.
TIMERS RELATED TO TREE MAINTENANCE. Each (S,G), (*,G), and (*,*,RP) route
entry has multiple timers associated with it: one for each interface in the outgoing interface
list, one for the multicast routing entry itself, and one optional join/prune-suppression
timer. Each (S,G) and (*,G) entry also has an assert timer and a random-delay-join
timer for use with asserts. In addition, DRs have a register-suppression timer for each
(S,G) entry and every router has a single join/prune timer. (A router may optionally keep
separate join/prune timers for different interfaces or route entries if different join/prune
periods are desired.)
The following table shows its usage when first adding the oif to the entry's oif list,
when it should be restarted (unless it is already higher), and when it should be decreased
(unless it is already lower).
Set to                     | When                           | Applies to
included Holdtime          | adding oif off Join/Prune      | (S,G) (*,G) (*,*,RP)

Increased (only) to        | When                           | Applies to
included Holdtime          | received Join/Prune            | (S,G) (*,G) (*,*,RP)
(*,*,RP) oif-timer value   | (*,*,RP) oif-timer restarted   | (S,G) (*,G)
(*,G) oif-timer value      | (*,G) oif-timer restarted      | (S,G)
When the timer expires, the oif is removed from the oif list if there are no directly
connected members. When deleted, the oif is also removed in any associated (S,G) or
(*,G) entries.
• [Entry-Timer (kept per route entry)]: A timer for each route entry is used to
time out that entry. The following table summarizes its usage when first adding
the oif to the entry's oif list and when it should be restarted (unless it is already
higher).
When the timer expires, the route entry is deleted; if the entry is a (*,G) or (*,*,RP)
entry, all associated (S,G)RPT-bit entries are also deleted.
DEFAULT TIMER VALUES. Most of the default timeout values for state information are
3.5 times the refresh period. For example, hellos refresh the Neighbor state and the
default hello timer period is 30 s, so a default neighbor timer duration of 105 s is included
in the holdtime field of the hellos. In order to improve convergence, however, the default
timeout value for information related to RP liveness and Bootstrap messages is 2.5 times
the refresh period.
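The relationship between refresh period and default timeout is simple arithmetic, shown here as a small Python sketch. The function name is hypothetical, and the 60 s period in the usage note is an illustrative input, not a value taken from the text.

```python
def default_timeout(refresh_period_s: float, rp_or_bootstrap: bool = False) -> float:
    """Default state timeout: 3.5 x the refresh period, or 2.5 x for
    information related to RP liveness and Bootstrap messages."""
    multiplier = 2.5 if rp_or_bootstrap else 3.5
    return multiplier * refresh_period_s
```

With the 30 s hello period quoted above, `default_timeout(30)` gives the 105 s neighbor timer. An RP-related refresh period of, say, 60 s would time out after `default_timeout(60, rp_or_bootstrap=True)`, that is, 150 s.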
In this version of the spec, the RFC suggests particular numerical timer settings; it is
possible to specify a mechanism for timer values to be scaled based upon observed
network parameters.
5.2.3.9 Summary of Flags Used. Following is a summary of all the flags used
in this scheme.
0 = Hello
1 = Register
2 = Register-Stop
3 = Join/Prune
4 = Bootstrap
5 = Assert
6 = Graft (used in PIM DM only)
7 = Graft-Ack (used in PIM DM only)
8 = Candidate-RP-Advertisement
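These type codes travel in the common PIM header that opens every message figure in this section (PIM Ver, Type, Reserved, Checksum). A minimal parsing sketch in Python follows; the function name `parse_pim_header` and the dictionary are hypothetical, and the checksum is returned without being verified.

```python
import struct
from typing import Tuple

PIM_TYPES = {
    0: "Hello", 1: "Register", 2: "Register-Stop", 3: "Join/Prune",
    4: "Bootstrap", 5: "Assert", 6: "Graft", 7: "Graft-Ack",
    8: "Candidate-RP-Advertisement",
}


def parse_pim_header(data: bytes) -> Tuple[int, str, int]:
    """Return (version, type name, checksum) from the first 4 header octets."""
    if len(data) < 4:
        raise ValueError("PIM header needs at least 4 octets")
    ver_type, _reserved, checksum = struct.unpack("!BBH", data[:4])
    version = ver_type >> 4          # high 4 bits: PIM Ver
    msg_type = ver_type & 0x0F       # low 4 bits: Type
    return version, PIM_TYPES.get(msg_type, "Unknown"), checksum
```

For instance, a first octet of 0x23 decodes as version 2, type 3, which is a Join/Prune message.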
Addr Family: The address family of the Unicast Address field of this address.
Here are the address family numbers assigned by IANA:
Number Description
0 Reserved
1 IP (IP Version 4)
2 IP6 (IP Version 6)
3 NSAP
4 HDLC (8-bit multidrop)
5 BBN 1822
6 802 (includes all 802 media plus Ethernet “canonical format”)
7 E.163
8 E.164 (SMDS, frame relay, ATM)
9 F.69 (Telex)
10 X.121 (X.25, frame relay)
11 IPX
12 Appletalk
13 Decnet IV
14 Banyan Vines
15 E.164 with NSAP format subaddress
Encoding Type: The type of encoding used within a specific address family. The
value “0” is reserved for this field and represents the native encoding of the address
family.
Unicast Address: The unicast address as represented by the given address family and
encoding type.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Addr Family | Encoding Type | Reserved | Mask Len |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Group multicast Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5.9. Encoded-Group Address
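The layout in Figure 5.9 maps directly onto a byte string. The following sketch packs an IPv4 group in that layout, using address family 1 (IP Version 4) and encoding type 0 (native) from the definitions above. The function name and defaults are assumptions for illustration, and only IPv4 is handled.

```python
import ipaddress
import struct


def pack_encoded_group(group: str, mask_len: int = 32,
                       addr_family: int = 1, encoding_type: int = 0) -> bytes:
    """Pack an Encoded-Group Address: family, encoding, reserved, mask, address."""
    if not 0 <= mask_len <= 32:
        raise ValueError("mask length must be 0..32 for IPv4")
    header = struct.pack("!BBBB", addr_family, encoding_type, 0, mask_len)
    return header + ipaddress.IPv4Address(group).packed
```

For example, the whole IPv4 multicast range 224.0.0.0/4 packs to the eight octets 01 00 00 04 E0 00 00 00.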
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Addr Family | Encoding Type | Rsrvd |S|W|R| Mask Len |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5.10. Encoded-Source Address
given address family and encoding type (for example, 32 for IPv4 native encoding and
128 for IPv6 native encoding).
The group multicast address contains the group address.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|PIM Ver| Type | Reserved | Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| OptionType | OptionLength |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| OptionValue |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+++
| . |
| . |
| . |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| OptionType | OptionLength |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| OptionValue |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+++
Figure 5.11. Hello Message
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|PIM Ver| Type | Reserved | Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|B|N| Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
Multicast data packet
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5.12. Register Message
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|PIM Ver| Type | Reserved | Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Group Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Unicast-Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5.13. Register-Stop Message
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|PIM Ver| Type | Reserved | Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Unicast-Upstream Neighbor Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved | Num groups | Holdtime |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Multicast Group Address-1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Number of Joined Sources | Number of Pruned Sources |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Joined Source Address-1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| . |
| . |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Joined Source Address-n |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Pruned Source Address-1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| . |
| . |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Pruned Source Address-n |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| . |
| . |
| . |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Multicast Group Address-n |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Number of Joined Sources | Number of Pruned Sources |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Joined Source Address-1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| . |
| . |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Joined Source Address-n |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Pruned Source Address-1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| . |
| . |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Pruned Source Address-n |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Number of Joined Sources: Number of join source addresses listed for a given
group.
Join Source Address 1 . . . n: This list contains the sources that the sending router will
forward multicast datagrams for if received on the interface this message is sent on.
See Section 5.2.4.1. The field explanation for the encoded-source address format
follows:
Reserved: Described above.
S—The Sparse bit is a 1-bit value, set to 1 for PIM SM. It is used for PIMv1
compatibility.
W—The WC-bit is a 1 bit value. If 1, the join or prune applies to the (*,G) or
(*,*,RP) entry. If 0, the join or prune applies to the (S,G) entry where S is the source
address. Joins and prunes sent toward the RP must have this bit set.
R—The RPT-bit is a 1-bit value. If 1, the information about (S,G) is sent toward the
RP. If 0, the information must be sent toward S, where S is the source address.
Mask Length, Source Address: Described above.
Represented in the form of
<WC-bit><RPT-bit><Mask length><Source address>:
Number of Pruned Sources: Number of prune source addresses listed for a group.
Prune Source Address 1 . . . n: This list contains the sources that the sending
router does not want to forward multicast datagrams for when received on the interface
this message is sent on. If the Join/Prune message boundary exceeds the maximum
packet size, then the join and prune lists for the same group must be included in the same
packet.
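The S, W, and R bits sit in the low three bits of the third octet of the encoded-source address (Figure 5.10), just before the mask length. The sketch below packs an IPv4 entry for a join or prune list. The function name and keyword arguments are hypothetical, and only IPv4 native encoding is handled.

```python
import ipaddress
import struct

S_BIT, W_BIT, R_BIT = 0x04, 0x02, 0x01   # positions within the flags octet


def pack_encoded_source(source: str, sparse: bool = True, wc: bool = False,
                        rpt: bool = False, mask_len: int = 32) -> bytes:
    """Pack an Encoded-Source Address for IPv4 (family 1, native encoding 0)."""
    flags = (S_BIT if sparse else 0) | (W_BIT if wc else 0) | (R_BIT if rpt else 0)
    header = struct.pack("!BBBB", 1, 0, flags, mask_len)
    return header + ipaddress.IPv4Address(source).packed
```

An (S,G) join toward the source carries only the S bit, so `pack_encoded_source("10.1.1.1")` has a flags octet of 0x04. A (*,G) join toward an RP at 10.0.0.1 sets all three bits, giving a flags octet of 0x07.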
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|PIM Ver| Type | Reserved | Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Fragment Tag | Hash Mask len | BSR-priority |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Unicast-BSR-Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Group Address-1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RP-Count-1 | Frag RP-Cnt-1 | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Unicast-RP-Address-1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RP1-Holdtime | RP1-Priority | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Unicast-RP-Address-2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RP2-Holdtime | RP2-Priority | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| . |
| . |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Unicast-RP-Address-m |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RPm-Holdtime | RPm-Priority | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Group Address-2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| . |
| . |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Group Address-n |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RP-Count-n | Frag RP-Cnt-n | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Unicast-RP-Address-1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RP1-Holdtime | RP1-Priority | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Unicast-RP-Address-2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RP2-Holdtime | RP2-Priority | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| . |
| . |
| . |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Unicast-RP-Address-m |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RPm-Holdtime | RPm-Priority | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
5.2.4.7 Assert Message. The Assert message is sent when a multicast data
packet is received on an outgoing interface corresponding to the (S,G) or (*,G) associated
with the source. See Figure 5.16.
PIM Version, Type, Reserved, and Checksum: Described above.
Encoded-Group Address: The group address to which the data packet was
addressed and which triggered the assert. Format previously described.
Encoded-Unicast-Source Address: Source address from the multicast datagram that
triggered the assert packet to be sent. The format for this address is given in the
encoded-unicast address in Section 4.1.
“R”: RPT-bit is a 1-bit value. If the multicast datagram that triggered the assert
packet is routed down the RP-tree, then the RPT-bit is 1; if the multicast datagram is
routed down the SPT, it is 0.
Metric Preference: Preference value assigned to the unicast routing protocol that
provided the route to the host address.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|PIM Ver| Type | Reserved | Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Group Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Unicast-Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R| Metric Preference |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Metric |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5.16. Assert Message
Metric: The unicast routing table metric. The metric is in units applicable to the
unicast routing protocol used.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|PIM Ver| Type | Reserved | Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Prefix-Cnt | Priority | Holdtime |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Unicast-RP-Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Group Address-1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| . |
| . |
| . |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Encoded-Group Address-n |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5.17. Candidate-RP-Advertisement
REFERENCES
[CIS200701] Cisco Systems, Internet Protocol (IP) Multicast Technology Overview, Cisco
Systems, Inc., San Jose, CA.
[PAR200601] L. Parziale, W. Liu, et al., TCP/IP Tutorial and Technical Overview, IBM Press,
Redbook Abstract, IBM Form Number GG24-3376-07, 2006.
[RFC2117] RFC 2117, Protocol Independent Multicast-Sparse Mode (PIM-SM): Protocol
Specification, D. Farinacci, A. Helmy, et al., June 1997 (obsoleted by RFC 2362).
[RFC2201] RFC 2201, Core Based Trees (CBT) Multicast Routing Architecture, A. Ballardie,
September 1997.
[RFC2362] RFC 2362, Protocol Independent Multicast-Sparse Mode (PIM-SM): Protocol
Specification, D. Estrin, D. Farinacci, et al., June 1998.
[WEL200101] P. J. Welcher, The Protocols of IP Multicast, White Paper, Chesapeake
NetCraftsmen, Arnold, MD.
6
MULTICAST
ROUTING—SPARSE-MODE
PROTOCOLS: CORE-BASED
TREES
This chapter looks at multicast approaches that make use of Core-Based Trees (CBTs).
The CBT protocol, defined in RFC 2201 and RFC 2189, is designed to build and maintain
a shared multicast distribution tree that spans only the networks and links that connect to
active receivers. CBT is the earliest and the simplest of the center-based tree protocols. It
was developed to address the issue of scalability: a shared tree architecture offers an
improvement in scalability over source tree architectures by a factor of the number of
active sources. Source trees scale O(S · G) because a distinct delivery tree is built per
active source. Shared trees, by contrast, eliminate the source (S) scaling factor; all
sources use the same shared tree, and, therefore, shared trees scale O(G). Core-based
forwarding trees have a single node, for example, a router, known as the core of the tree,
from which branches emanate; these branches are made up of other routers, known as
noncore routers, that form a shortest path between a member host's directly attached
router and the core. It should be noted that CBT's commercial deployment has been
rather limited to date. Three versions have evolved, but they are not backward
compatible (see footnote 1).
1
Version 2 of the CBT protocol specification differs significantly from the previous version. CBT Version 2 is
not, and was not, intended to be backwards compatible with Version 1. The same is true for Version 3.
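The O(S · G) versus O(G) comparison made in the chapter introduction can be made concrete with a small counting sketch. The group and source addresses are hypothetical, and the count is simply the number of delivery trees, ignoring per-router details.

```python
def source_tree_entries(sources_by_group):
    """One (S, G) delivery tree per active source of each group."""
    return sum(len(sources) for sources in sources_by_group.values())


def shared_tree_entries(sources_by_group):
    """One shared tree per group, however many sources send to it."""
    return len(sources_by_group)


# Hypothetical example: two groups with three and two active sources.
example = {
    "239.1.1.1": {"10.0.0.1", "10.0.0.2", "10.0.0.3"},
    "239.1.1.2": {"10.0.0.1", "10.0.0.4"},
}
print(source_tree_entries(example), shared_tree_entries(example))  # 5 2
```

Adding a sender to an existing group increases the first count but leaves the second unchanged, which is the scaling advantage claimed for shared trees.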
6.1 MOTIVATION
The Core-Based Tree Version 2 (CBTv2) network layer multicast routing protocol
builds a shared multicast distribution tree per group. CBT is intended to support inter-
and intradomain multicast routing. CBT may use a separate multicast routing table, or
it may use that of underlying unicast routing, to establish paths between senders and
receivers [RFC2189]. A CBT architecture is advantageous compared with the
source-based architecture for the following reasons [BAL199301]:
shortest path between a nonreceiver sender and the multicast tree since they
incur no tree-building overhead.
. Unicast routing separation. CBT formation and multicast packet flow are
decoupled from, but take full advantage of, underlying unicast routing,
irrespective of which underlying unicast algorithm is being used. All of the multicast
tree information can be derived solely from a router's existing unicast
forwarding tables. These factors result in the CBT architecture being as robust as the
underlying unicast routing algorithm (note that with respect to IP networks CBT
requires no partition of the unicast address space). In this architecture there are
two distinct routing phases:
However, there are also weaknesses with one core-based multicast tree per group,
including the following [BAL199301]:
. Core placement and shortest path trees. In practical implementations a manual
“best guess” placement for the core router is used. However, CBTs may not
provide optimal paths between members of a group; this is especially
true for small, localized groups that have a nonlocal core. A dynamic core
placement mechanism may have to be used. Without good core placement,
CBTs can be inefficient; consequently, CBT has not been used as a global
multicast routing protocol [UCL200701].
. The core as a point of failure. The most obvious point of vulnerability of a CBT
is its core, whose failure can result in a tree becoming partitioned. Having
multiple cores associated with each tree solves this problem (though at the cost
of increased complexity).
Also, CBT never properly solved the problem of how to map a group address to the
address of a core. In addition, good core placement is a hard problem.
[Figure 6.1. Two panels showing the core router, routers 1 through 4, receivers A and C, and the source host. First panel: Join messages sent toward the core router. Second panel: data flow over the bidirectional shared tree.]
trees). When a receiver joins a multicast group, a multicast tree is built as follows (also
see Figure 6.1): the receiver's local CBT router uses the multicast address to obtain the
address of the core router for the group from its table. The local CBT router then sends a
Join message for the group toward the core router. At each router on the way to the core
router, forwarding state information is created for the group; additionally, an
acknowledgment is sent back to the previous router.
After the bidirectional tree is built, if a sender (that happens to be a member of the
group) needs to transmit multicast data to the group, the source's local router (router 1 in
the example of Figure 6.1) forwards the packets to any of its neighbors that are on the
multicast tree (router 3 in Figure 6.1). Each router that receives a packet forwards it out of
all its interfaces that are on the tree except the one the packet came from (router 3 sends
packets to router 2 and to the core router in Figure 6.1). It is conceivable that a sender's
local router is not on the tree (recall that IP multicast does not require senders to a group to
be members of the group). In this case, the datagram is forwarded to the next hop toward
the core. At some point the datagram will either reach a router that is on the tree or reach
the core; at that juncture the datagram is distributed along the multicast tree.
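The forwarding rule just described can be sketched as a small simulation. The topology below is an assumption loosely modeled on the labels of Figure 6.1 (the exact links are not recoverable from the text), router "r5" is a hypothetical off-tree sender, and the function name is illustrative only.

```python
from collections import deque


def deliver(tree, next_hop_to_core, first_router):
    """Return (off_tree_path, on_tree_routers_reached) for one packet.

    tree maps each on-tree router to its on-tree neighbors;
    next_hop_to_core gives the unicast next hop for off-tree routers.
    """
    # Off-tree phase: unicast hop by hop toward the core until the
    # packet meets a router that is on the tree (possibly the core).
    router, off_tree_path = first_router, []
    while router not in tree:
        off_tree_path.append(router)
        router = next_hop_to_core[router]

    # On-tree phase: each router forwards over every tree interface
    # except the one the packet arrived on. A tree has no cycles, so
    # no visited set is needed.
    reached, queue = [], deque([(router, None)])
    while queue:
        node, came_from = queue.popleft()
        reached.append(node)
        for neighbor in tree[node]:
            if neighbor != came_from:
                queue.append((neighbor, node))
    return off_tree_path, reached


# Assumed tree: core-r3, core-r4, r3-r2, r3-r1.
tree = {
    "core": ["r3", "r4"],
    "r3": ["core", "r2", "r1"],
    "r4": ["core"],
    "r2": ["r3"],
    "r1": ["r3"],
}
print(deliver(tree, {}, "r1"))
print(deliver(tree, {"r5": "r4"}, "r5"))
```

In both calls every on-tree router receives the packet exactly once; the second call shows a sender whose local router first has to forward toward the core.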
CBT COMPONENTS AND FUNCTIONS
This section looks at CBT components and functions, as described in RFC 2201.
The CBT protocol is designed to build and maintain a shared multicast distribution
tree that spans only those networks and links leading to the interested receivers. To
achieve this, a host first expresses its interest in joining a group by multicasting an IGMP
Host Membership Report (HMR) across its attached link. On receiving this report, a local
CBT-aware router invokes the tree joining process (unless it has already done so) by
generating a JOIN_REQUEST message, which is sent to the next hop on the path toward
the group's core router (how the local router discovers which core to join is discussed
later). This Join message must be explicitly acknowledged (JOIN_ACK) either by the core
router itself or by another router that is on the unicast path between the sending router
and the core and that has itself already successfully joined the tree. Note that all CBT
routers, similar to other multicast protocol routers, are expected to participate in IGMP
for the purpose of monitoring directly attached group memberships and acting as IGMP
querier when needed.
The Join message sets up a transient join state in the routers it traverses, and this state
consists of <group, incoming interface, outgoing interface>. “Incoming interface” and
“outgoing interface” may be “previous hop” and “next hop,” respectively, if the
corresponding links do not support multicast transmission. “Previous hop” is taken
from the incoming control packet's IP source address, and “next hop” is gleaned from the
routing table (the next hop to the specified core address). This transient state eventually
times out unless it is “confirmed” with a join acknowledgment (JOIN_ACK) from
upstream. The JOIN_ACK traverses the reverse path of the corresponding Join message,
which is possible due to the presence of the transient join state. Once the
acknowledgment reaches the router that originated the Join message, the new receiver can
receive traffic sent to the group.
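The join and acknowledgment exchange can be sketched as follows. This is a simplified model under stated assumptions: timers are omitted, hops are identified by name rather than by interface, and the class, the function, and the "local-hosts" label are hypothetical.

```python
class Router:
    def __init__(self, name, is_core=False):
        self.name = name
        self.is_core = is_core
        self.transient = {}  # group -> (previous hop, next hop), unconfirmed
        self.parent = {}     # group -> upstream neighbor, confirmed
        self.children = {}   # group -> set of downstream neighbors, confirmed

    def on_tree(self, group):
        return self.is_core or group in self.parent


def send_join(path, group):
    """Walk a JOIN_REQUEST along path (originator first, core last).

    Returns the name of the router that sent the JOIN_ACK, or None if
    the join never met the tree (its transient state would time out).
    """
    travelled = []
    previous = "local-hosts"  # hypothetical label for the attached LAN
    for position, router in enumerate(path):
        if router.on_tree(group):
            # The core or an on-tree router acknowledges the join.
            router.children.setdefault(group, set()).add(previous)
            # The JOIN_ACK retraces the path, confirming transient state.
            for hop in reversed(travelled):
                prev_hop, next_hop = hop.transient.pop(group)
                hop.parent[group] = next_hop
                hop.children.setdefault(group, set()).add(prev_hop)
            return router.name
        if position + 1 == len(path):
            break
        router.transient[group] = (previous, path[position + 1].name)
        travelled.append(router)
        previous = router.name
    return None


core, r3, r2, r4 = Router("core", True), Router("r3"), Router("r2"), Router("r4")
print(send_join([r2, r3, core], "239.1.1.1"))  # core
print(send_join([r4, r3, core], "239.1.1.1"))  # r3
```

The second join is acknowledged by router r3 rather than by the core, because r3 joined the tree as a result of the first exchange.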
Loops cannot be created in a CBT because (i) there is only one active core per group
and (ii) tree-building/maintenance scenarios that may lead to the creation of tree loops
are avoided. For example, if a router's upstream neighbor becomes unreachable, the
router immediately “flushes” all of its downstream branches, allowing them to
individually rejoin if necessary. Transient unicast loops do not pose a threat because a new Join
message that loops back on itself will never get acknowledged and thus eventually times
out.
The state created in routers by the sending or receiving of a JOIN_ACK is
bidirectional—data can flow either way along a tree “branch,” and the state is group
specific—it consists of the group address and a list of local interfaces over which Join
messages for the group have previously been acknowledged. There is no concept of
“incoming” or “outgoing” interfaces, though it is necessary to be able to distinguish the
upstream interface from any downstream interfaces. In CBT, these interfaces are known
as the “parent” and “child” interfaces, respectively.
With regard to the information contained in the multicast forwarding cache, on link
types not supporting native multicast transmission an on-tree router must store the
address of a parent and any children. On links supporting multicast, however, parent and
any child information is represented with local interface addresses (or similar identifying
information, such as an interface “index”) over which the parent or child is reachable.
2
This section is based directly on RFC 2201.
When a multicast data packet arrives at a router, the router uses the group address as
an index into the multicast forwarding cache. A copy of the incoming multicast data
packet is forwarded over each interface (or to each address) listed in the entry except the
incoming interface.
Each router on a CBT multicast tree, except the core router, is responsible for
maintaining its upstream link, provided it has interested downstream receivers, that is, the
child interface list is not NULL. A child interface is one over which a member host is
directly attached or one over which a downstream on-tree router is attached. This “tree
maintenance” is achieved by each downstream router periodically sending a “Keepalive”
message (ECHO_REQUEST) to its upstream neighbor, that is, its parent router on the tree.
One Keepalive message is sent to represent all entries with the same parent, thereby
improving scalability on links that are shared by many groups. On multicast-capable
links, a keepalive is multicast to the “all-cbt-routers” group (IANA assigned as
224.0.0.15); this has a suppressing effect on any other router for which the link is its
parent link. If a parent link does not support multicast transmission, keepalives are unicast.
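The aggregation of keepalives by parent can be sketched in a few lines. The mapping from group to parent interface is a hypothetical stand-in for a router's forwarding cache, and the interface names are illustrative.

```python
from collections import defaultdict


def keepalives_to_send(parent_by_group):
    """Group forwarding cache entries by parent.

    One ECHO_REQUEST is needed per distinct parent, not per group.
    """
    groups_by_parent = defaultdict(list)
    for group, parent in parent_by_group.items():
        groups_by_parent[parent].append(group)
    return dict(groups_by_parent)


# Hypothetical cache: three groups, two of which share a parent link.
cache = {"239.1.1.1": "eth0", "239.1.1.2": "eth0", "239.1.1.3": "eth1"}
print(len(keepalives_to_send(cache)))  # 2
```

With this grouping, the number of keepalives a router sends grows with the number of distinct parents rather than with the number of groups.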
The receipt of a Keepalive message over a valid child interface immediately prompts
a response (ECHO_REPLY), which is either unicast or multicast, as appropriate. The
ECHO_REQUEST does not contain any group information; the ECHO_REPLY does,
but only periodically. To maintain consistent information between parent and child, the
parent periodically reports, in an ECHO_REPLY, all groups for which it has a state, over
each of its child interfaces for those groups. This group-carrying echo reply is not
prompted explicitly by the receipt of an Echo Request message. A child is notified of the
time to expect the next Echo Reply message containing group information in an echo
reply prompted by a child's echo request. The frequency of parent group reporting is at
the granularity of minutes.
It cannot be assumed that all of the routers on a multiaccess link have a uniform view of
unicast routing; this is particularly the case when a multiaccess link spans two or more
unicast routing domains. This could lead to multiple upstream tree branches being
formed (an error condition) unless steps are taken to ensure that all routers on the link
agree on which router is the upstream router for a particular group. CBT routers attached
to a multiaccess link participate in an explicit election mechanism that elects a single
router, the designated router (DR), as the link's upstream router for all groups. Since the
DR might not be the link's best next hop for a particular core router, this may result in
Join messages being redirected back across a multiaccess link. If this happens, the
redirected Join message is unicast across the link by the DR to the best next hop, thereby
preventing a looping scenario. This redirection only ever applies to Join messages. While
this is suboptimal for Join messages, which are generated infrequently, multicast data
never traverses a link more than once (either natively or encapsulated).
In all but the exception case described above, all CBT control messages are multicast
over multicast supporting links to the “all-cbt-routers” group, with IP TTL 1. When a
CBT control message is sent over a nonmulticast supporting link, it is explicitly
addressed to the appropriate next hop.
CORE ROUTER DISCOVERY
This section looks at CBT core router discovery, as described in RFC 2201.
Core router discovery is the most difficult aspect of shared tree multicast
architectures, particularly in the context of Interdomain Multicast Routing (IDMR).
There have been a number of proposals over the years, including advertising core
addresses in a multicast session directory, manual placement, and the Hierarchical
Protocol Independent Multicast (HPIM) approach of strictly dividing up the multicast
address space into many “hierarchical scopes” and using explicit advertising of core
routers between scope levels.
Two options for CBTv2 core discovery are the “bootstrap” mechanism and manual
placement. The bootstrap mechanism (as specified with the PIM SM protocol) is
applicable only to intradomain core discovery and allows for a “plug-and-play” type
operation with minimal configuration. The disadvantage of the bootstrap mechanism is
that it is much more difficult to affect the shape, and thus optimality, of the resulting
distribution tree. Also, it must be implemented by all CBT routers within a domain. It is
unlikely at this stage that the bootstrap mechanism will be appended to a well-known
network layer protocol, such as IGMP or ICMP, though this would facilitate its
ubiquitous (intradomain) deployment. Therefore, each multicast routing protocol
requiring the bootstrap mechanism must implement it as part of the multicast routing
protocol itself.
3
This section is based directly on RFC 2201.
Manual configuration of leaf routers with <core, group> mappings is the other
option (note: leaf routers only); this imposes a degree of administrative burden: the
mapping for a particular group must be coordinated across all leaf routers to ensure
consistency. Hence, this method does not scale particularly well. However, it is likely that
“better” trees will result from this method, and it is also the only option currently
available for interdomain core discovery. A summary of the operation of the
bootstrap mechanism follows (the topic is also revisited in Section 6.5.11). It is assumed
that all routers within the domain implement the “bootstrap” protocol, or at least forward
bootstrap protocol messages.
A subset of the domain's routers are configured to be CBT candidate core routers.
Each candidate core router periodically (default every 60 s) advertises itself to the
domain's bootstrap router (BSR) using Core Advertisement messages. The BSR is itself
elected dynamically from all (or participating) routers in the domain. The domain's
elected BSR collects Core Advertisement messages from candidate core routers and
periodically advertises a Candidate Core set (CC-set) to each other router in the domain
using traditional hop-by-hop unicast forwarding. The BSR uses Bootstrap messages to
advertise the CC-set. Together, Core Advertisement and Bootstrap messages comprise the
bootstrap protocol.
When a router receives an IGMP host membership report from one of its directly
attached hosts, the local router uses a hash function on the reported group address, the
result of which is used as an index into the CC-set. This is how local routers discover
which core to use for a particular group.
Note the hash function is specifically tailored such that a small number of
consecutive groups always hash to the same core. Furthermore, Bootstrap messages
can carry a “group mask,” potentially limiting a CC-set to a particular range of groups.
This can help reduce traffic concentration at the core.
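The text does not give the hash function itself. As an illustration only, the sketch below assumes the hash defined for the PIM-SM bootstrap mechanism in RFC 2362, from which the CBT bootstrap mechanism is borrowed: a value is computed for every candidate and the candidate with the highest value is selected, with ties going to the highest address. The hash mask length of 30 bits, the function name, and the addresses are assumptions.

```python
import ipaddress


def core_for_group(group, cc_set, hash_mask_len=30):
    """Select a core for an IPv4 group from the candidate core set."""
    g = int(ipaddress.IPv4Address(group))
    # Only the top hash_mask_len bits of the group take part in the hash,
    # so consecutive groups sharing those bits map to the same core.
    mask = (0xFFFFFFFF << (32 - hash_mask_len)) & 0xFFFFFFFF

    def value(core):
        c = int(ipaddress.IPv4Address(core))
        inner = (1103515245 * (g & mask) + 12345) ^ c
        return (1103515245 * inner + 12345) % 2**31

    # Highest hash value wins; the highest address breaks ties.
    return max(
        cc_set,
        key=lambda core: (value(core), int(ipaddress.IPv4Address(core))),
    )


candidates = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
chosen = {core_for_group(f"239.1.1.{i}", candidates) for i in range(4)}
print(len(chosen))  # 1
```

Because every router applies the same function to the same CC-set, all routers in the domain arrive at the same core for a group without exchanging any further messages.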
If a BSR detects a particular core as being unreachable (it has not announced its
availability within some period), it deletes the relevant core from the CC-set sent in its
next Bootstrap message. This is how a local router discovers that a group's core is
unreachable; the router must rehash for each affected group and join the new core after
removing the old state. The removal of the “old” state follows the sending of a
QUIT_NOTIFICATION upstream and a FLUSH_TREE message downstream.
In this section, details of the CBT protocol are presented in the context of a single router
implementation based on RFC 2189.
PROTOCOL SPECIFICATION DETAILS
6.5.1.1 Sending HELLOs. When a router starts up, it multicasts two HELLO
messages over each of its broadcast interfaces in succession. The DR flag is initially
unset (FALSE) on each broadcast interface. This avoids the situation in which each
router on a multiaccess subnet believes it is the DR, thus preventing the multiple
forwarding of Join requests should they arrive during this startup period. If no “better”
HELLO message is received after holdtime seconds, the router assumes the role of DR
on the corresponding interface. A router sends a HELLO message whenever its
[HELLO_INTERVAL] expires. Whenever a router sends a HELLO message, it resets
its hello timer.
set between zero (0) and [HOLDTIME] seconds when the lesser preferenced HELLO
message is received.
If a JOIN_REQUEST for the same group is scheduled to be sent over the corre-
sponding interface (i.e., awaiting a timer expiry), the JOIN_REQUEST is unscheduled.
If this router has a cache-deletion timer [CACHE_DEL_TIMER] running on the
arrival interface for the group specified in a multicast join, the timer is cancelled.
6.5.3.1 Sending JOIN_ACKs. The JOIN_ACK is sent over the same interface on which
the corresponding JOIN_REQUEST was received. The sending of the acknowledgment
causes the router to add the interface to its child interface list in its forwarding cache for
the group, if it is not already present. A JOIN_ACK is multicast or unicast according to
whether the outgoing interface supports multicast transmission or not.
sending over the same interface; the scheduled interval is between 0 (zero) and holdtime
seconds. This message is multicast to the “all-cbt-routers” group over multicast-capable
interfaces and unicast otherwise.
If a multicast ECHO_REQUEST message arrives via any valid parent interface,
the router resets its [ECHO_INTERVAL] timer for that upstream interface,
thereby suppressing the sending of its own ECHO_REQUEST over that upstream
interface.
discarded. The Flush message must be forwarded over each child interface for the
specified group. Once the Flush message has been forwarded, all of the state for the group
is removed from the router's forwarding cache.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  vers | type  |  addr len     |         checksum              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6.2. CBT Common Control Packet Header
6.5.10.1 CBT Common Control Packet Header. All CBT control messages
have a common fixed-length header, as shown in Figure 6.2.
This CBT specification is Version 2.
CBT packet types are:
. Type 0: HELLO
. Type 1: JOIN_REQUEST
. Type 2: JOIN_ACK
. Type 3: QUIT_NOTIFICATION
. Type 4: ECHO_REQUEST
. Type 5: ECHO_REPLY
. Type 6: FLUSH_TREE
. Type 7: Bootstrap message (optional)
. Type 8: Candidate Core Advertisement (optional)
. Address length: address length in bytes of unicast or multicast addresses carried in
the control packet.
. Checksum: the 16-bit one's complement of the one's complement sum of the
entire CBT control packet.
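The common header and its checksum can be sketched as follows. The field widths follow Figure 6.2 (4-bit version, 4-bit type, 8-bit address length, 16-bit checksum); setting the checksum field to zero while the sum is computed is an assumption based on common Internet practice, and the function names are hypothetical.

```python
import struct


def internet_checksum(data):
    """16-bit one's complement of the one's complement sum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


def build_cbt_packet(msg_type, body=b"", version=2, addr_len=4):
    """Prepend the common control packet header to body."""
    if not 0 <= msg_type <= 0x0F:
        raise ValueError("type must fit in 4 bits")
    first_byte = (version << 4) | msg_type
    # Compute the checksum with the checksum field set to zero (assumed).
    draft = struct.pack("!BBH", first_byte, addr_len, 0) + body
    checksum = internet_checksum(draft)
    return struct.pack("!BBH", first_byte, addr_len, checksum) + body


# ECHO_REQUEST (type 4) carrying the originating child router 10.0.0.1.
packet = build_cbt_packet(4, bytes([10, 0, 0, 1]))
print(packet.hex())  # 2404d1fa0a000001
```

A receiver can validate a packet by running the same checksum over the whole packet, checksum field included; an intact packet yields zero.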
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  CBT (Control Packet Header)                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Preference   |  option type  |  option len   |  option value |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6.3. HELLO Packet Format
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  CBT (Control Packet Header)                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         group address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         target router                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      originating router                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  option type  |  option len   |        option value           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6.4. JOIN_REQUEST Packet Format
option length 0 (zero). This option type is used with HELLO messages sent by a
BR as part of designated BR election.
. Option length: length of the Option Value field in bytes.
. Option value: variable-length field carrying the option value.
. Group address: multicast group address of the group being joined. For a
“wildcard” join this field contains the value of INADDR_ANY.
. Target router: target (core) router for the group.
. Originating router: router that originated this JOIN_REQUEST.
. Option type, option length, and option value: see HELLO packet format.
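The address fields of the JOIN_REQUEST can be illustrated with a short sketch. It assumes IPv4 (an address length of 4 bytes), omits the common header and the options, and uses hypothetical function names and addresses.

```python
import ipaddress


def pack_join_request(group, target_core, originator):
    """Pack group address, target router, and originating router."""
    return b"".join(
        ipaddress.IPv4Address(addr).packed
        for addr in (group, target_core, originator)
    )


def parse_join_request(body):
    """Return the three address fields as dotted-quad strings."""
    if len(body) < 12:
        raise ValueError("JOIN_REQUEST body needs three 4-byte addresses")
    fields = [str(ipaddress.IPv4Address(body[i:i + 4])) for i in (0, 4, 8)]
    return dict(zip(("group", "target_router", "originating_router"), fields))


body = pack_join_request("239.1.1.1", "10.0.0.1", "10.0.2.1")
print(parse_join_request(body))

# A "wildcard" join carries INADDR_ANY (0.0.0.0) in the group field.
wildcard = pack_join_request("0.0.0.0", "10.0.0.1", "10.0.2.1")
print(wildcard[:4] == bytes(4))  # True
```

The address length field of the common header tells the receiver how many bytes each of these fields occupies, which is what would let the same layout carry addresses of another length.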
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  CBT (Control Packet Header)                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         group address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         target router                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  option type  |  option len   |        option value           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6.5. JOIN_ACK Packet Format
. Originating child router: address of the router that originates the
QUIT_NOTIFICATION.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  CBT (Control Packet Header)                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         group address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   originating child router                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6.6. QUIT_NOTIFICATION Packet Format
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  CBT (Control Packet Header)                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   originating child router                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6.7. ECHO_REQUEST Packet Format
. Originating child router: address of the router that originates the
ECHO_REQUEST.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  CBT (Control Packet Header)                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   originating parent router                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       group address #1                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       group address #2                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            ......                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       group address #n                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6.8. ECHO_REPLY Packet Format
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  CBT (Control Packet Header)                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         group address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            ......                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       group address #n                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6.9. FLUSH_TREE Packet Format
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              CBT (common control packet header)               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      For full Bootstrap Message specification, see [7]       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6.10. Bootstrap Message Format
result of which is used as an index into the CC-set. This is how local routers discover
which core to use for a particular group.
Note the hash function is specifically tailored such that a small number of
consecutive groups always hash to the same core. Furthermore, Bootstrap messages
can carry a “group mask,” potentially limiting a CC-set to a particular range of groups.
This can help reduce traffic concentration at the core.
If a BSR detects a particular core as being unreachable (it has not announced its
availability within some period), it deletes the relevant core from the CC-set sent in its
next Bootstrap message. This is how a local router discovers that a group's core is
unreachable; the router must rehash for each affected group and join the new core
after removing the old state. The removal of the “old” state follows the sending of a
QUIT_NOTIFICATION upstream and a FLUSH_TREE message downstream.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               CBT common control packet header                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| For full Candidate Core Adv. Message specification, see [7]  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The CBT Version 3 specification, published as an Internet Draft in 1998, superseded and
obsoleted RFC 2189. Changes from RFC 2189 include support for source-specific
joining and pruning to provide better CBT transit domain capability, new packet formats,
and new robustness features. Unfortunately, most of these changes are not backward
compatible with RFC 2189; however, neither at that time nor now has CBT enjoyed
widespread implementation or deployment. Specifically, changes from RFC 2189 are
[BAL199801]:
. Forwarding cache support for entries of different granularities, that is, (*, G),
(*, Core), or (S, G) and support for S and/or G masks for representing S and/or
G aggregates
. Included support for joins, quits (prunes), and flushes of different granularities,
that is, (*, G), (*, Core), or (S, G), where S and/or G can be aggregated
. Optional one-way join capability
. Improved the LAN HELLO protocol and included a state diagram
. Revised packet format and provided option support for all control packets
. Added downstream state timeout to CBT router
. Revised the CBT “keepalive” mechanism between adjacent on-tree CBT routers
. Overall provided added clarification of protocol events and mechanisms
received via a pruned interface but must not be forwarded over a pruned interface.
Prune state can also be instantiated by the QUIT_NOTIFICATION message (see
Section 6.6.8).
A JOIN_REQUEST is made unidirectional by the inclusion of the “unidirectional”
JOIN option, which is copied to the corresponding JOIN_ACK; JOIN_REQUEST options
are always copied to the corresponding JOIN_ACK.
CBT now supports source-specific joins/prunes so as to be better equipped when
deployed in a transit domain; source-specific control messages are only ever
generated by CBT BRs. Source-specific control messages follow G, not S, that is,
they are routed toward the core (not S) and no further. Thus, the (S, G) state only
exists on the “core tree” in a CBT domain—those routers and links between a BR and
a core router.
A router is not considered “on tree” until it has received a JOIN_ACK for a
previously sent/forwarded JOIN_REQUEST and has instantiated the relevant forward-
ing state.
Loops cannot be created in a CBT because (a) there is only one active core per group
and (b) tree-building/maintenance scenarios that may lead to the creation of tree loops are
avoided. For example, if a router's parent router for a group becomes unreachable, the
router (child) immediately “flushes” all of its downstream branches, allowing them to
individually rejoin if necessary. Transient unicast loops do not pose a threat because a
new Join message that loops back on itself will never get acknowledged and thus
eventually times out.
is known as the router's Shared Forwarding Cache, or SFC. By having all CBT
implementations support an SFC, any CBT router is eligible to become a BR.
(*, Core) entries are only relevant to a CBT PFC. This state is represented in the
cache by specifying the core's IP unicast address in place of a group address/group
address range.
With respect to representing groups (Gs) in the forwarding cache, G may be an
individual Class D 32-bit group address or a prefix representing a contiguous range of
group addresses (a group aggregate). Similarly, for source-specific PFC entries, S can be
an aggregate. Therefore, the PFC should support the inclusion of masks or mask lengths
to be associated with each of S and G.
In CBT, all PFC entries require that an entry's “upstream” interface is distinguishable
as such; how this is done is implementation dependent. CBT uses the term “parent”
interchangeably with “upstream” and “child/children” interchangeably with “downstream.”
A core router's parent is always NULL.
Whenever the sending/receiving of a CBT join or prune results in the instantiation of
a more specific state in the router [e.g., a (*, Core) state exists, then a (*, G) join arrives],
the children of the new entry represent the union of the children from all other less
specific forwarding cache entries as well as the child (interface) over which the message
was received (if not already included). This is so that at most a single forwarding cache
entry needs to be matched with an incoming packet.
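The union rule for a newly instantiated, more specific entry can be sketched as follows. The cache is modeled as a dictionary from an entry key to a set of child interfaces; the key format, the interface names, and the function name are hypothetical.

```python
def instantiate_more_specific(cache, new_key, arrival_child, less_specific_keys):
    """Create (or extend) a more specific forwarding cache entry.

    Its children are the union of the children of every less specific
    entry plus the child interface over which the message arrived.
    """
    children = {arrival_child}
    for key in less_specific_keys:
        children |= cache.get(key, set())
    cache[new_key] = cache.get(new_key, set()) | children
    return cache[new_key]


# A (*, Core) entry exists; a (*, G) join then arrives over eth3.
cache = {("*", "core 10.0.0.1"): {"eth1", "eth2"}}
new_children = instantiate_more_specific(
    cache, ("*", "239.1.1.1"), "eth3", [("*", "core 10.0.0.1")]
)
print(sorted(new_children))  # ['eth1', 'eth2', 'eth3']
```

Since the new entry already contains the children of the less specific ones, a data packet needs to match only the most specific entry to reach every interested interface.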
Note that in CBT there is no notion of “expected” or “incoming” interface for (S, G)
forwarding entries—these are treated just like (*, G) entries.
A router may delete a forwarding cache entry whose children are all marked as pruned
as a result of receiving Quit messages, provided there exists no less specific state
with at least one nonpruned child.
and ECHO_REPLY messages, with the child routers responsible for periodically
(explicitly) querying the parent router. The parent router (implicitly) monitors its
children by expecting to periodically receive queries (ECHO_REQUESTs) from each
child (per child router on nonbroadcast networks; per child interface on broadcast
networks). The repeated absence of either an expected query (ECHO_REQUEST) or
expected response (ECHO_REPLY) results in the corresponding interface being marked
as pruned in the router's forwarding cache. This constitutes a state timeout due to an
exception condition. An interface can also be pruned in an explicit and timely fashion by
means of either a QUIT_NOTIFICATION (downstream to upstream) or FLUSH_TREE
(upstream to downstream) message.
Note that the network path comprising a CBT branch only changes due to
connectivity failure. An implementation could, however, invoke the tearing down and
rebuilding of a tree branch whenever an underlying routing change occurs, irrespective of
whether that change is due to connectivity failure. This is not CBT’s default behavior.
tree loops. The router that actually forwards a join off-LAN for a group (toward the
group’s core) is known as the LAN “upstream router” for that group. A group’s LAN
upstream router may or may not be the LAN DR.
With regard to a JOIN_REQUEST being multicast onto a broadcast LAN, the LAN
DR decides over which interface to forward it. Depending on the group’s core location, the
DR may redirect (unicast) the join back across the same link on which it arrived, to what it
considers the best next hop toward the core. In this case, the LAN DR does not keep any
transient state for the JOIN_REQUEST it passed on. This best next-hop router is then the LAN
upstream forwarder for the corresponding group. This redirection only applies to joins, which
are relatively infrequent; native multicast data never traverses a link more than once.
For the case where a DR originates a join and has to unicast it to a LAN neighbor, the
DR must keep a transient state for the join.
On broadcast LANs it is necessary for a router to be able to distinguish between a
directly attached (downstream) group member and any (at least one) downstream on-tree
router(s). For a router to be able to send a QUIT_NOTIFICATION (prune) upstream, it
must be sure it has neither directly attached (downstream) group members nor on-tree
routers reachable via a downstream interface. How this is achieved is implementation
dependent. One possible way would be for a CBT forwarding cache to maintain two extra
bits for each child entry—one bit to indicate the presence of a group member on that
interface, the other bit indicating the presence of an on-tree router on that interface. Both
these bits must be clear (i.e., unset) before this router can send a QUIT_NOTIFICATION
for the corresponding state upstream.
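The two-bit scheme suggested above might look like the following sketch. The class and field names are hypothetical; the text only says that two bits per child entry would suffice.

```python
from dataclasses import dataclass


@dataclass
class ChildState:
    has_member: bool = False          # a directly attached group member is present
    has_on_tree_router: bool = False  # a downstream on-tree router is present


def may_send_quit(children):
    """A QUIT_NOTIFICATION may go upstream only when both bits are clear
    on every child interface of the forwarding cache entry."""
    return all(
        not c.has_member and not c.has_on_tree_router
        for c in children.values()
    )


children = {"eth1": ChildState(), "eth2": ChildState(has_on_tree_router=True)}
blocked = may_send_quit(children)
children["eth2"].has_on_tree_router = False
allowed = may_send_quit(children)
```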
7.1 OVERVIEW
As we know by now, PIM employs an explicit join model for sparse groups; transmission
occurs on a shared tree but it can switch to a per-source tree. PIM DM also uses the
underlying unicast routing information base to flood multicast datagrams to all multicast
routers (it does not have a topology discovery mechanism often used by a unicast routing
protocol). Prune messages are used to prevent messages from propagating to routers
without active group members. PIM DM uses RPF. PIM DM employs the same packet
formats PIM SM uses. Note that:
. In PIM DM there are no periodic joins being transmitted, only explicitly triggered
prunes and grafts.
. In PIM DM there is no RP. This is advantageous in networks that cannot accept a
single point of failure, such as commercial IPTV networks.
. PIM DM does not maintain a keepalive timer associated with each (S,G) route
(unlike the case with PIM SM). In PIM DM, route and state information associated
with an (S,G) entry is maintained as long as any protocol timer associated with that
(S,G) entry is active. Thereafter, all information concerning that (S,G) route is
discarded.
PIM DM makes the assumption that when a source S starts transmitting, all downstream
users wish to receive multicast datagrams. Initially, the multicast datagrams are flooded
to all areas of the network. PIM DM uses RPF to prevent looping of multicast datagrams
while flooding. If some areas of the network do not have group members, PIM DM will
prune off the forwarding branch by instantiating a prune state. In IPTV applications, all
routers supporting a neighborhood DSL Access Multiplexer (DSLAM) will typically
need to receive all multicast packets (with each content channel using a different
multicast address); hence, pruning is unlikely, although it may occur in
the middle of the night if nobody is watching TV.
154 MULTICAST ROUTING—DENSE-MODE PROTOCOLS: PIM DM
[Figure: pruning in PIM DM, before and after a Prune message is sent; the diagram shows a source host, routers A, B, and C, and receiver hosts.]
A prune state has a finite lifetime; when that lifetime expires, data will again be
forwarded down the previously pruned branch. The broadcast of datagrams followed
by the pruning of unwanted branches is referred to as a flood-and-prune cycle. The
prune state is associated with an (S,G) pair. When a new member for a group G appears
in a pruned area, a router can “graft” toward the source S for the group, thereby
activating the pruned branch back into a forwarding branch. To minimize the repeated
flooding and pruning associated with a particular (S,G) pair, PIM DM uses a State
Refresh message. This message is sent by the router(s) directly connected to the source
and is propagated throughout the network; when the message is received by a router on
its RPF interface, the State Refresh message causes an existing prune state to be
refreshed [RFC3973].
PROTOCOL SPECIFICATION 155
PIM DM has a simplified design compared with multicast routing protocols with
built-in topology discovery mechanisms (e.g., DVMRP) and is not dependent on any
specific topology discovery protocol. However, this simplification does incur more
potential overhead in some applications by causing flooding and pruning activities to
take place on some links that could be avoided if sufficient topology information were
available. In IPTV applications, this increased overhead is accepted in exchange for the
simplicity and flexibility obtained by not relying on a specific topology discovery
protocol.
1
The rest of this chapter is liberally based on RFC 3973.
7.3.1.2 (S,G) State. For every source/group pair (S,G), a router stores the state
shown in Table 7.3.
the relevant state. The macros pim_include(*,G) and pim_include(S,G) indicate the
interfaces to which traffic might or might not be forwarded because of hosts that are local
members on those interfaces.
The sets defined in Table 7.5 are used in the descriptions of the state machines.
7.3.3.3 Hello Message Hold Time. The hold time in the Hello message
should be set to a value that can reasonably be expected to keep the hello active until a
new Hello message is received. On most links, this will be 3.5 times the value of
Hello_Period.
If the hold time is set to 0xFFFF, the receiving router must not time out that Hello
message. This feature might be used for on-demand links to avoid keeping the link up
with periodic Hello messages. If a hold time of 0 is received, the corresponding neighbor
state expires immediately. When a PIM router takes an interface down or changes the IP
address, a Hello message with a zero hold time should be sent immediately (with the old
IP address if the IP address is changed) to cause any PIM neighbors to remove the old
information immediately.
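The hold time rules above can be illustrated with a short sketch. The function names are hypothetical, and the 30-second Hello_Period used in the example is an assumed value rather than one stated in this section.

```python
HOLD_FOREVER = 0xFFFF  # receiver must never time out this neighbor


def default_hold_time(hello_period):
    """Hold time advertised on most links: 3.5 times Hello_Period."""
    return int(3.5 * hello_period)


def neighbor_expiry(now, hold_time):
    """Return the absolute time at which neighbor state expires.

    None means the state never expires; a hold time of 0 yields `now`,
    that is, the neighbor state expires immediately.
    """
    if hold_time == HOLD_FOREVER:
        return None
    return now + hold_time


advertised = default_hold_time(30)  # assumed Hello_Period of 30 seconds
```

A router taking an interface down would send a Hello with hold time 0, so that `neighbor_expiry` on its peers returns the current time and the old state is dropped at once.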
are sent. If the neighbor is downstream, the router may replay the last State
Refresh message for any (S,G) pairs for which it is the assert winner indicating
prune and assert status to the downstream router. These State Refresh messages
should be sent out immediately after the Hello message. If the neighbor is the
upstream neighbor for an (S,G) entry, the router may cancel its prune limit timer to
permit sending a Prune message and reestablishing a pruned state in the upstream
router.
Upon startup, a router may use any State Refresh messages received within
Hello_Period of its first Hello message on an interface to establish state information.
The state refresh source will be the RPF’(S), and the prune status for all interfaces
will be set according to the prune indicator bit in the State Refresh message. If the
prune indicator is set, the router should set the PruneLimitTimer to Prune_Holdtime
and set the PruneTimer on all downstream interfaces to the state refresh’s interval
times 2. The router should then propagate the state refresh as described in
Section 7.3.5.1.
[Figure: Upstream(S,G) state machine with the Forward, Pruned, and AckPending states; transition labels include olist == NULL, olist != NULL, Rcv GraftAck, Rcv State Refresh with (P==0), and S Directly Connect.]
response to B’s prune to override the prune. This is the only situation in PIM DM in
which a Join message is used. Finally, a Graft message is used to rejoin a previously
pruned branch to the delivery tree.
                                          Previous State
Event                      Forwarding        Pruned            AckPending
olist(S,G) -> NULL         ->P               N/A               ->P
                           Send Prune(S,G)                     Send Prune(S,G)
                           Set PLT(S,G)                        Set PLT(S,G)
                                                               Cancel GRT(S,G)
olist(S,G) -> non-NULL     N/A               ->AP              N/A
                                             Send Graft(S,G)
                                             Set GRT(S,G)
RPF’(S) Changes AND        ->AP              ->AP              ->AP
olist(S,G) != NULL         Send Graft(S,G)   Send Graft(S,G)   Send Graft(S,G)
                           Set GRT(S,G)      Set GRT(S,G)      Set GRT(S,G)
RPF’(S) Changes AND        ->P               ->P               ->P
olist(S,G) == NULL                           Cancel PLT(S,G)   Cancel GRT(S,G)
S becomes directly         ->F               ->P               ->F
connected                                                      Cancel GRT(S,G)
GRT(S,G) Expires           N/A               N/A               ->AP
                                                               Send Graft(S,G)
                                                               Set GRT(S,G)
Receive GraftAck(S,G)      ->F               ->P               ->F
from RPF’(S)                                                   Cancel GRT(S,G)
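The transition table above maps naturally onto a lookup keyed by event and previous state. The following is a sketch only: the event strings and the `step` helper are hypothetical names, and the actions are kept as descriptive strings rather than real timer or message operations.

```python
F, P, AP = "Forwarding", "Pruned", "AckPending"

SEND_PRUNE = ["Send Prune(S,G)", "Set PLT(S,G)"]
SEND_GRAFT = ["Send Graft(S,G)", "Set GRT(S,G)"]

# (event, previous state) -> (next state, actions); N/A cells are omitted.
TRANSITIONS = {
    ("olist->NULL", F): (P, SEND_PRUNE),
    ("olist->NULL", AP): (P, SEND_PRUNE + ["Cancel GRT(S,G)"]),
    ("olist->non-NULL", P): (AP, SEND_GRAFT),
    ("RPF' changes, olist != NULL", F): (AP, SEND_GRAFT),
    ("RPF' changes, olist != NULL", P): (AP, SEND_GRAFT),
    ("RPF' changes, olist != NULL", AP): (AP, SEND_GRAFT),
    ("RPF' changes, olist == NULL", F): (P, []),
    ("RPF' changes, olist == NULL", P): (P, ["Cancel PLT(S,G)"]),
    ("RPF' changes, olist == NULL", AP): (P, ["Cancel GRT(S,G)"]),
    ("S directly connected", F): (F, []),
    ("S directly connected", P): (P, []),
    ("S directly connected", AP): (F, ["Cancel GRT(S,G)"]),
    ("GRT expires", AP): (AP, SEND_GRAFT),
    ("GraftAck from RPF'", F): (F, []),
    ("GraftAck from RPF'", P): (P, []),
    ("GraftAck from RPF'", AP): (F, ["Cancel GRT(S,G)"]),
}


def step(state, event):
    """Return (next_state, actions); raise KeyError for an N/A cell."""
    return TRANSITIONS[(event, state)]
```

Leaving the N/A cells out of the dictionary makes an impossible event in a given state fail loudly instead of being silently ignored; an implementation could equally choose to log and continue.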
TRANSITIONS FROM THE FORWARDING (F) STATE. When the Upstream(S,G) state machine is in
the F state, the events shown in Table 7.8 may trigger a transition.
TRANSITIONS FROM THE PRUNED (P) STATE. When the Upstream(S,G) state machine is in the P
state, the events shown in Table 7.9 may trigger a transition.
TRANSITIONS FROM THE ACKPENDING (AP) STATE. When the Upstream(S,G) state machine is in
the AP state, the events shown in Table 7.10 may trigger a transition.
Event                      Description
AND S NOT directly         state machine must transition to the P state.
connected                  The GRT(S,G) must be cancelled.
S becomes directly         Unicast routing has changed so that S is directly connected.
connected                  The graft retry timer must be cancelled, and the
                           Upstream(S,G) state machine must transition to the
                           F state.
GRT(S,G) expires           The GRT(S,G) expires for this (S,G) entry. The Upstream(S,G)
                           state machine stays in the AP state. Another Graft message
                           for (S,G) should be unicast to RPF’(S) and the GRT(S,G)
                           reset to Graft_Retry_Period. It is recommended that the
                           router retry a configured number of times before ceasing retries.
See GraftAck(S,G)          A GraftAck is received from RPF’(S). The graft retry timer
from RPF’(S)               must be cancelled, and the Upstream(S,G) state machine
                           must transition to the F state.
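[Figure: Prune(S,G) downstream state machine; the PrunePending and Pruned states and the transition labels Rcv Prune and PPT Expires survive from the diagram.]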
field does not match this router’s address on I, then these state transitions in this state
machine must not occur.
TRANSITIONS FROM THE NOINFO STATE. When the Prune(S,G) downstream state machine is in
the NI state, the events identified in Table 7.13 may trigger a transition.
TRANSITIONS FROM THE PRUNEPENDING (PP) STATE. When the Prune(S,G) downstream state
machine is in the PP state, the events identified in Table 7.14 may trigger a transition.
TRANSITIONS FROM THE PRUNE (P) STATE. When the Prune(S,G) downstream
state machine is in the P state, the events identified in Table 7.15 may trigger a
transition.
if (I contained in prunes(S,G)) {
    set Prune Indicator bit of SRM’ to 1;
    if StateRefreshCapable(I) == TRUE
        set PT(S,G) to largest active holdtime read from a Prune
        message accepted on I;
} else {
    set Prune Indicator bit of SRM’ to 0;
}
if (AssertState == NoInfo) {
    set Assert Override of SRM’ to 1;
} else {
    set Assert Override of SRM’ to 0;
}
transmit SRM’ on I;
}
The pseudocode above employs the macro definitions listed in Table 7.16.
Figure 7.4. State Refresh State Machine
[States: NotOriginator and Originator; the surviving transition label reads “SAT expires or S not direct connect.”]
Source Active Timer        This timer is first set when the Origination(S,G) state machine
(SAT(S,G))                 transitions to the O state and is reset on the receipt of every
                           data packet from S addressed to group G. When it expires,
                           the Origination(S,G) state machine transitions to the NO state.
                           This timer is normally set to SourceLifetime.
TRANSITIONS FROM THE NOTORIGINATOR (NO) STATE. When the Origination(S,G) state
machine is in the NO state, the event shown in Table 7.19 may trigger a transition.
TRANSITIONS FROM THE ORIGINATOR (O) STATE. When the Origination(S,G) state machine is
in the O state, the events shown in Table 7.20 may trigger a transition.
struct assert_metric {
    metric_preference;
    route_metric;
    ip_address;
};

assert_metric
my_assert_metric(S,G,I) {
    if (CouldAssert(S,G,I) == TRUE) {
        return spt_assert_metric(S,I)
    } else {
        return infinite_assert_metric()
    }
}
spt_assert_metric(S,I) gives the assert metric used when a router sends an assert
based on active (S,G) forwarding state:
assert_metric
spt_assert_metric(S,I) {
    return {0,MRIB.pref(S),MRIB.metric(S),my_addr(I)}
}
MRIB.pref(X) and MRIB.metric(X) are the routing preference and routing metrics
associated with the route to a particular (unicast) destination X, as determined by the
MRIB; my_addr(I) is simply the router’s network (e.g., IP) address associated with the
local interface I.
infinite_assert_metric() gives the assert metric used when a router needs to send an assert
but has no matching (S,G) forwarding state:
assert_metric
infinite_assert_metric() {
    return {1,infinity,infinity,0}
}
7.3.6.3 Assert State Macros. The macro lost_assert(S,G,I) is used in the olist
computations of the state summarization macros and is defined as follows:
bool lost_assert(S,G,I) {
    if ( RPF_interface(S) == I ) {
        return FALSE
    } else {
        return (AssertWinner(S,G,I) != me AND
                (AssertWinnerMetric(S,G,I) is better than
                 spt_assert_metric(S,I)))
    }
}
7.3.6.4 (S,G) Assert Message State Machine. The (S,G) assert state
machine for interface I is shown in Figure 7.5. There are three states as depicted in
Table 7.21.
In addition, an Assert Timer (AT(S,G,I)) is used to time out the assert state.
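[Figure 7.5: (S,G) assert state machine for interface I; only the Winner and Loser state labels survive from the diagram.]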
Terminology: A “preferred assert” is one with a better metric than the current winner.
An “inferior assert” is one with a worse metric than my_assert_metric(S,G,I).
The state machine uses the following macro:
CouldAssert(S,G,I) = (RPF_interface(S) != I)
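The assert metric pseudocode and the CouldAssert macro can be made concrete with a small sketch. It uses the three fields of struct assert_metric. The comparison order (lower preference, then lower route metric, then higher IP address as the tie-breaker) is taken from RFC 3973 and is not spelled out in this section, and the function signatures are simplified, hypothetical versions of the macros.

```python
import ipaddress
from typing import NamedTuple


class AssertMetric(NamedTuple):
    metric_preference: float
    route_metric: float
    ip_address: str


def _key(m):
    # Lower preference and metric win; a higher address wins a tie,
    # so the address is negated to keep "smaller key is better".
    return (m.metric_preference, m.route_metric,
            -int(ipaddress.ip_address(m.ip_address)))


def is_better(a, b):
    """True when assert metric a is preferred over assert metric b."""
    return _key(a) < _key(b)


def infinite_assert_metric():
    return AssertMetric(float("inf"), float("inf"), "0.0.0.0")


def could_assert(rpf_interface, iface):
    """CouldAssert(S,G,I): true on any interface that is not the RPF interface."""
    return rpf_interface != iface


def my_assert_metric(rpf_interface, iface, pref, metric, my_addr):
    if could_assert(rpf_interface, iface):
        return AssertMetric(pref, metric, my_addr)
    return infinite_assert_metric()
```

With this ordering, a “preferred assert” is one for which `is_better(received, current_winner)` holds, and an “inferior assert” is one for which `is_better(my_metric, received)` holds.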
TRANSITIONS FROM NI STATE. In the NI state, the events shown in Table 7.22 may trigger
transitions.
TRANSITIONS FROM WINNER STATE. When in “I am Assert Winner” state, the events shown in
Table 7.23 trigger transitions.
TRANSITIONS FROM LOSER STATE. When in “I am Assert Loser” state, the transitions shown in
Table 7.24 can occur.
TABLE 7.23. Events That Trigger a Transition from the “I am Assert Winner” State
Event                      Description
An (S,G) data packet       An (S,G) data packet arrived on a downstream interface.
arrives on downstream      The assert state machine remains in the “I am Assert
interface I                Winner” state. The router must send an Assert(S,G) to
                           interface I and set the Assert Timer (AT(S,G,I)) to
                           Assert_Time.
Receive Inferior Assert    An (S,G) assert is received containing a metric for S that
or State Refresh           is worse than this router’s metric for S. Whoever sent the
                           assert is in error. The router must send an Assert(S,G) to
                           interface I and reset AT(S,G,I) to Assert_Time.
Receive Preferred Assert   An (S,G) assert or state refresh is received that has a
or State Refresh           better metric than this router’s metric for S on interface
                           I. The assert state machine must transition to “I am
                           Assert Loser” state and store the new assert winner’s
                           address and metric. If the metric was received in an
                           assert, the router must set AT(S,G,I) to Assert_Time. If
                           the metric was received in a state refresh, the router
                           must set AT(S,G,I) to three times the state refresh
                           interval. The router must also multicast a Prune(S,G) to
                           the assert winner, with a prune holdtime equal to the
                           assert timer, and evaluate any changes in its
                           Upstream(S,G) state machine.
Send State Refresh         The router is sending a State Refresh(S,G) message on
                           interface I. The router must set AT(S,G,I) to three
                           times the state refresh interval contained in the State
                           Refresh(S,G) message.
AT(S,G,I) Expires          The (S,G) AT(S,G,I) expires. The assert state
                           machine must transition to the NI state.
CouldAssert(S,G,I) ->      This router’s RPF interface changed, making
FALSE                      CouldAssert(S,G,I) false. This router can no longer
                           perform the actions of the Assert winner, so the assert
                           state machine must transition to NI state, send an
                           AssertCancel(S,G) to interface I, cancel the AT(S,G,I),
                           and remove itself as the assert winner.
7.3.6.5 Rationale for Assert Rules. The following is a summary of the rules
for generating and processing Assert messages. It is not intended to be definitive (the
state machines and pseudocode provide the definitive behavior). Instead, it provides
some rationale for the behavior.
1. The assert winner for (S,G) must act as the local forwarder for (S,G) on behalf of
all downstream members.
2. PIM messages are directed to the RPF’ neighbor and not to the regular RPF
neighbor.
TABLE 7.24. Events That Trigger a Transition from the “I am Assert Loser” State
Event                      Description
Receive Inferior           An assert or state refresh is received from the current Assert
Assert or State            winner that is worse than this router’s metric for S (typically, the
Refresh from               winner’s metric became worse). The assert state machine must
Current Winner             transition to NI state and cancel the Assert Timer (AT(S,G,I)).
                           The router must delete the previous assert winner’s address
                           and metric and evaluate any possible transitions to its
                           Upstream(S,G) state machine. Usually this router will
                           eventually reassert and win when data packets from S have
                           started flowing again.
Receive Preferred          An assert or state refresh is received that has a metric better than
Assert or State            or equal to that of the current assert winner. The assert state
Refresh                    machine remains in the Loser (L) state. If the metric was received
                           in an assert, the router must set the AT(S,G,I) to Assert_Time.
                           If the metric was received in a state refresh, the router must set
                           the AT(S,G,I) to three times the received state refresh interval.
                           If the metric is better than the current assert winner, the router
                           must store the address and metric of the new assert winner,
                           and if CouldAssert(S,G,I) == TRUE,
                           the router must multicast a Prune(S,G) to the new assert winner.
AT(S,G,I) expires          The (S,G) AT(S,G,I) expires. The assert state machine must
                           transition to NI state. The router must delete the assert winner’s
                           address and metric. If CouldAssert == TRUE, the router must
                           evaluate any possible transitions to its Upstream(S,G)
                           state machine.
CouldAssert ->             CouldAssert has become FALSE because interface I has become
FALSE                      the RPF interface for S. The assert state machine must transition
                           to NI state, cancel AT(S,G,I), and delete information
                           concerning the assert winner on I.
CouldAssert ->             CouldAssert has become TRUE because interface I used to be the
TRUE                       RPF interface for S, and now it is not. The assert state machine
                           must transition to NI state, cancel AT(S,G,I), and delete
                           information concerning the assert winner on I.
Current Assert             The current assert winner’s NeighborLiveness Timer (NLT(N,I))
Winner’s                   has expired. The Assert state machine must transition to the
NeighborLiveness           NI state, delete the assert winner’s address and metric,
Timer Expires              and evaluate any possible transitions to its Upstream(S,G)
                           state machine.
Receive Prune(S,G),        A Prune(S,G), Join(S,G), or Graft(S,G) message was received
Join(S,G), or              on interface I with its upstream neighbor address set to the
Graft(S,G)                 router’s address on I. The router must send an Assert(S,G)
                           on the receiving interface I to initiate an assert negotiation.
                           The assert state machine remains in the assert L state. If a
                           Graft(S,G) was received, the router must respond with a
                           GraftAck(S,G).
7.3.7.1 PIM Header. As discussed in Chapter 5, all PIM Control messages have
the header shown in Figure 7.6.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|PIM Ver| Type | Reserved | Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
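As an illustration of the header layout above, the following sketch unpacks the first four bytes of a PIM control message with Python's standard struct module. The function name and the returned dictionary are hypothetical, and checksum verification is deliberately left out.

```python
import struct


def parse_pim_header(data: bytes):
    """Split the 4-byte PIM header into its fields.

    Byte 0 carries the 4-bit version (high nibble) and the 4-bit type
    (low nibble); byte 1 is reserved; bytes 2-3 are the checksum in
    network byte order.
    """
    ver_type, reserved, checksum = struct.unpack("!BBH", data[:4])
    return {
        "version": ver_type >> 4,
        "type": ver_type & 0x0F,
        "reserved": reserved,
        "checksum": checksum,
    }


# Example bytes: version 2, type 6 (the Graft type), made-up checksum 0x1234.
header = parse_pim_header(bytes([0x26, 0x00, 0x12, 0x34]))
```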
7.3.7.5 Graft Message Format. PIM Graft messages use the same format as
Join/Prune messages, except that the Type field is set to 6. The source address must be in
the Join section of the message. The Holdtime field should be zero and should be ignored
when a graft is received.
7.3.7.7 State Refresh Message Format. PIM State Refresh messages have
the format shown in Figure 7.7.
PIM Ver, Type, Reserved, Checksum: Described in PIM SM (see Chapter 5).
Multicast Group Address: The multicast group address in the encoded multicast
address format given in PIM SM (see Chapter 5).
Source Address: The address of the data source in the encoded unicast address
format given in PIM SM (see Chapter 5).
Originator Address: The address of the first hop router in the encoded unicast
address format given in PIM SM (see Chapter 5).
R: The RP-tree bit. Set to 0 for PIM DM. Ignored upon receipt.
Metric Preference: The preference value assigned to the unicast routing protocol that
provided the route to the source.
Metric: The cost metric of the unicast route to the source. The metric is in units
applicable to the unicast routing protocol used.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|PIM Ver| Type | Reserved | Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Multicast Group Address (Encoded Group Format) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address (Encoded Unicast Format) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Originator Address (Encoded Unicast Format) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R| Metric Preference |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Metric |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Masklen | TTL |P|N|O|Reserved | Interval |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 7.7. Refresh Message
Masklen: The length of the address mask of the unicast route to the source.
TTL: Time to Live of the State Refresh message. Decremented each time the
message is forwarded. Note that this is different from the IP header TTL, which is always
set to 1.
P: Prune indicator flag. This must be set to 1 if the state refresh is to be sent on a
pruned interface. Otherwise, it must be set to 0.
N: Prune Now flag. This should be set to 1 by the state refresh originator on every
third State Refresh message and should be ignored upon receipt. This is for compatibility
with earlier versions of state refresh.
O: Assert Override flag. This should be set to 1 by upstream routers on a LAN if the
Assert Timer (AT(S,G)) is not running and should be ignored upon receipt. This is for
compatibility with earlier versions of state refresh.
Reserved: Set to zero and ignored upon receipt.
Interval: Set by the originating router to the interval (in seconds) between consecu-
tive State Refresh messages for this (S,G) pair.
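The last 32-bit word of the State Refresh message packs four fields and three flag bits. A minimal sketch of encoding and decoding that word follows; the function names are hypothetical, and the bit positions follow the field order in Figure 7.7 (P, N, O in the three high-order bits of the third byte, followed by five reserved bits).

```python
import struct


def pack_refresh_tail(masklen, ttl, p, n, o, interval):
    """Encode Masklen, TTL, the P/N/O flags, and Interval as 4 bytes."""
    flags = (int(p) << 7) | (int(n) << 6) | (int(o) << 5)  # low 5 bits reserved = 0
    return struct.pack("!BBBB", masklen, ttl, flags, interval)


def unpack_refresh_tail(data: bytes):
    masklen, ttl, flags, interval = struct.unpack("!BBBB", data[:4])
    return {
        "masklen": masklen,
        "ttl": ttl,
        "P": bool(flags & 0x80),  # prune indicator
        "N": bool(flags & 0x40),  # prune now (ignored upon receipt)
        "O": bool(flags & 0x20),  # assert override (ignored upon receipt)
        "interval": interval,
    }


tail = pack_refresh_tail(masklen=24, ttl=16, p=True, n=False, o=True, interval=60)
decoded = unpack_refresh_tail(tail)
```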
REFERENCES
[PAR200601] L. Parziale, W. Liu, et al., TCP/IP Tutorial and Technical Overview, IBM Press,
Redbook Abstract, IBM Form Number GG24-3376-07, 2006.
[RFC3973] RFC 3973, Protocol Independent Multicast - Dense Mode (PIM-DM): Protocol
Specification (Revised), A. Adams, J. Nicholas, W. Siadak, January 2005.
8
OTHER DENSE-MODE MULTICAST ROUTING PROTOCOLS: DVMRP AND MOSPF
This chapter provides a short overview of other DM MRPs not covered so far, specifically
Distance Vector Multicast Routing Protocol (DVMRP) and Multicast Open Shortest Path
First (MOSPF). These protocols are generally not used in current IPTV/DVB-H
applications, but MOSPF may have applicability at some future point in time. Portions
of this discussion are based on the pertinent RFCs.
8.1.1 Overview
In previous chapters we discussed the processes typically used for the creation of
source-based multicast trees. As noted, these distribution trees can be built by
information present in the underlying unicast routing table (as is the case with
PIM DM) or
. a link-state algorithm (as is the case with MOSPF, also discussed below).
The distance vector multicast algorithm builds a multicast delivery tree using a
variant of the RPF technique. In broad terms, the technique is as follows: when a
multicast router receives a multicast data packet, if the packet arrives on the interface
used to reach the source of the packet, the packet is forwarded over all outgoing
interfaces, except leaf subnets with no members attached; a “leaf” subnet is a subnet that
no router would use to reach the source of a multicast packet. If the data packet does not
arrive over the link that would be used to reach the source, then the packet is discarded.
This constitutes a “broadcast-and-prune” approach to multicast tree construction: when a
data packet reaches a leaf router, if that router has no membership registered on any of its
directly attached subnetworks, the router sends a Prune message one hop back toward the
source. The receiving router then checks its leaf subnets for group membership and
checks whether it has received a prune from all of its downstream routers (downstream
with respect to the source). If so, the router itself can send a Prune message upstream over
the interface leading to the source. The sender and receiver of a Prune message must
cache the <S, G> pair being reported for a “lifetime,” typically of minutes. Unless a
router’s prune information is refreshed by the receipt of a new prune for <source, group>
before its “lifetime” expires, that information is removed, allowing data to flow over the
branch again. State that expires in this way is referred to as “soft state” [RFC2201].
Note that routers that do not lead to group members still have to deal with overhead
generated by Prune messages. For wide-area multicasting this technique does not scale
(as discussed in Chapter 6).
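The forwarding decision described above (accept a packet only on the interface used to reach its source, then forward everywhere except memberless leaf subnets) can be sketched as follows. The lookup table, interface names, and function name are hypothetical.

```python
def rpf_forward(rpf_iface_for, source, arrival_iface, interfaces, memberless_leaves):
    """Return the interfaces a multicast packet is forwarded on.

    The packet is discarded (empty list) unless it arrived on the
    interface this router would use to reach the source. Otherwise it is
    sent on every other interface except leaf subnets with no members.
    """
    if rpf_iface_for.get(source) != arrival_iface:
        return []
    return [
        i for i in interfaces
        if i != arrival_iface and i not in memberless_leaves
    ]


rpf_table = {"10.1.1.1": "eth0"}  # interface used to reach each source
accepted = rpf_forward(rpf_table, "10.1.1.1", "eth0", ["eth0", "eth1", "eth2"], {"eth2"})
dropped = rpf_forward(rpf_table, "10.1.1.1", "eth1", ["eth0", "eth1", "eth2"], {"eth2"})
```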
1
DVMRP has been used to support MBONE, a multicast service over the Internet, by establishing tunnels
between DVMRP-capable machines; MBONE was used widely in the research community.
DISTANCE VECTOR MULTICAST ALGORITHM 187
Note that:
DVMRP uses its own routing process; this routing is based on hop counts and is
similar to the Routing Information Protocol (RIP). DVMRP does not route unicast
datagrams, hence a router that needs to process both multicast and unicast datagrams must
be configured with two separate routing processes. Note, as a consequence, that the path
that the multicast traffic uses to reach a destination may not be the same as the path that the
unicast traffic uses to reach the same destination. While this is not necessarily
a problem in itself, the implication is that dual state must be maintained at a router.
A key operation of DVMRP is the neighbor discovery operation. A DVMRP router
dynamically discovers its neighbors by periodically transmitting Neighbor Probe
messages over each of its local interfaces. The transmitted message contains a list of
neighbor routers from which Neighbor Probe messages have been received. Upon
receiving a Probe message that contains its own address in the neighbor list, the pair
of routers establishes a two-way neighbor adjacency relationship. Messages are transmitted
to the multicast address 224.0.0.4, “all-DVMRP-routers” (see Chapter 2).
Another key operation is the routing table creation operation. The DVMRP
algorithm is based on hop counts and the algorithm computes the set of reverse paths
identified in the RPF algorithm. As part of the process, routing tables are exchanged
between neighboring routers. The algorithm makes use of a metric configured on every
router interface. Each router advertises the network number, mask, and metric of each
interface; it also advertises routes received from neighbor routers. As is the case for other
distance vector protocols, when a route is received, the interface metric is added to the
advertised metric; this adjusted metric is then used to determine the best upstream path to
the source [PAR200601].
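The metric adjustment described above is the usual distance-vector update. A minimal sketch follows, assuming a hop-count “infinity” of 32 and a table keyed by network prefix; the table layout and function name are hypothetical.

```python
INFINITY = 32  # assumed hop count that marks a network as unreachable


def receive_route(table, network, advertised_metric, iface, iface_metric):
    """Apply one received route advertisement.

    The local interface metric is added to the advertised metric; the
    route is installed if it is reachable and better than the current
    best. Returns the (metric, upstream interface) now held, or None.
    """
    adjusted = min(advertised_metric + iface_metric, INFINITY)
    best = table.get(network)
    if adjusted < INFINITY and (best is None or adjusted < best[0]):
        table[network] = (adjusted, iface)
    return table.get(network)


routes = {}
first = receive_route(routes, "10.0.0.0/8", 3, "eth0", 1)
second = receive_route(routes, "10.0.0.0/8", 1, "eth1", 1)
third = receive_route(routes, "10.0.0.0/8", 5, "eth2", 1)
```

The interface stored with the best metric is the one a router would treat as its upstream path toward that source network.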
The creation of a list of dependent downstream routers is also important. The
DVMRP algorithm uses the routing information exchange mechanism to notify
upstream routers that a specific downstream router requires these upstream routers to
forward multicast traffic to it. DVMRP accomplishes this as
described next. If a downstream router selects an upstream router as the next hop to
a particular source, then the routing updates from the downstream router have a metric set
to a very large number (“infinity”) for the source network. When the upstream DVMRP
router receives the advertisement, it adds the downstream router to the list of dependent
downstream routers for this source. This technique provides the information needed to
prune the multicast delivery tree [PAR200601]. DVMRP prevents the forwarding of
duplicate packets by enforcing the concept of a designated forwarder for each source.
This is accomplished as follows: when routers exchange their routing table, each router
makes note of the peer’s metric to reach the source network. By convention, it will be the
router with the lowest metric that is responsible for forwarding data to the shared network
(if multiple routers have the same metric, then the router with the lowest IP address
becomes the designated forwarder for the network).
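The election rule above (lowest metric, with the lowest IP address breaking ties) reduces to a single ordered comparison. The candidate list format and function name in this sketch are hypothetical.

```python
import ipaddress


def designated_forwarder(candidates):
    """Pick the designated forwarder from (ip_address, metric) pairs.

    The lowest metric to the source network wins; among equal metrics
    the numerically lowest IP address wins.
    """
    return min(
        candidates,
        key=lambda c: (c[1], int(ipaddress.ip_address(c[0]))),
    )[0]


routers = [("192.0.2.9", 3), ("192.0.2.10", 2), ("192.0.2.2", 2)]
winner = designated_forwarder(routers)
```

Comparing addresses as integers rather than strings matters here: as strings, "192.0.2.10" would sort before "192.0.2.2".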
Building and maintaining multicast delivery trees is obviously a key operation of the
protocol. The RPF algorithm is used in a DVMRP environment to forward multicast
datagrams. As described in Chapter 3, with RPF, if a datagram is received via the
interface that represents the best path to the source, then the router forwards the datagram
along the set of downstream interfaces. This set contains each downstream interface
included in the multicast delivery tree.
If a multicast router has no dependent downstream neighbors through a specific
interface, the network as seen from that interface is called a leaf network. If a network is a
leaf for a given source, and if there are no members of a particular group on the network,
then there are no recipients for datagrams from the source to the group on that network.
That network’s parent router can forgo sending those datagrams on that network; this is
also called “truncating” the shortest path tree. The algorithm that tracks and uses this
information is the Truncated Reverse Path Broadcasting (TRPB) algorithm [RFC1075]. If
a new router connects to a leaf network, packets are forwarded on that network only if there
are hosts/receivers that are members of the specific multicast group as determined through
the IGMP and the local group database that is generated from IGMP transactions. If the
group address is currently listed and the router is the designated forwarder for the source,
then, and only then, the interface is included in the multicast delivery tree.
Pruning of the multicast tree is important in order to optimize network resources. At
first, all networks that are nonleaf networks are included in the multicast delivery tree.
However, routers connected to leaf networks remove an interface from the tree when there are no
longer any active receivers/members participating in the specific multicast group;
thereafter, multicast packets are no longer forwarded through the interface. If and when
a router is able to remove all of its downstream interfaces for a specific group, it notifies its
upstream neighbor (by sending a Prune message to the upstream neighbor) that it no
longer is in need of traffic from that particular source and group pair. In turn, if the
upstream neighbor receives Prune messages from each of the dependent downstream
routers on a given interface, the upstream router can remove this interface from the
multicast delivery tree. In turn, again, if this upstream router is able to prune all of its
interfaces from the tree, it sends a Prune message to its upstream router. This process
continues until all unused branches have been pruned from the delivery tree. In order to
eliminate the possibility of using outdated prune information, each Prune message
contains a prune lifetime timer that indicates the length of time that the prune is to
remain in effect.
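The prune bookkeeping described above can be sketched as follows. This is an illustration under stated assumptions: the class and method names are hypothetical, time is passed in explicitly as a number of seconds, and one object stands for one (source, group) pair on one downstream interface.

```python
class PruneState:
    """Prune bookkeeping for one (source, group) pair on one downstream interface."""

    def __init__(self, dependents):
        self.dependents = set(dependents)  # dependent downstream routers
        self.prune_expiry = {}             # router -> time at which its prune lapses

    def receive_prune(self, router, now, lifetime):
        # The prune lifetime carried in the message bounds how long it stays valid.
        if router in self.dependents:
            self.prune_expiry[router] = now + lifetime

    def interface_pruned(self, now):
        """True only while every dependent router holds an unexpired prune."""
        return bool(self.dependents) and all(
            self.prune_expiry.get(r, 0.0) > now for r in self.dependents
        )


state = PruneState({"r1", "r2"})
state.receive_prune("r1", now=0.0, lifetime=100.0)
print(state.interface_pruned(now=1.0))   # False: r2 has not pruned yet
state.receive_prune("r2", now=0.0, lifetime=50.0)
print(state.interface_pruned(now=10.0))  # True: both prunes are in effect
print(state.interface_pruned(now=60.0))  # False: r2's prune has expired
```

Because each prune lapses on its own, a branch is automatically reinstated once any prune ages out, which is how the lifetime guards against outdated prune information.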
DVMRP routers use Graft messages to reattach portions of the network to the
multicast delivery tree. This is needed because IP multicast is required to accommodate
dynamic group membership. A Graft message is sent upon receiving an IGMP Membership
Report (MR) for a group that has previously been pruned. Distinct Graft messages are
transmitted to the appropriate upstream neighbor for each source network that has been
pruned. Receipt of a Graft message is acknowledged with a Graft ACK message; if an
acknowledgment is not received within the graft timeout period, the request is
retransmitted. ACKs enable the sender to distinguish between a misrouted graft packet
and an inactive device.
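The acknowledge-or-retransmit behavior can be sketched as a simple loop. The `transmit` callback and the `max_attempts` limit are assumptions introduced for illustration; the text above does not state how many retransmissions are attempted.

```python
from typing import Callable


def send_graft(transmit: Callable[[], bool], max_attempts: int = 3) -> int:
    """Send a Graft until it is acknowledged.

    `transmit` is a hypothetical callback that sends one Graft and returns True
    if a Graft ACK arrived within the graft timeout period. Returns the number
    of transmissions used; raises TimeoutError if none was acknowledged.
    """
    for attempt in range(1, max_attempts + 1):
        if transmit():
            return attempt
    raise TimeoutError("no Graft ACK received")


# Simulated upstream neighbor that acknowledges only the third Graft.
replies = iter([False, False, True])
print(send_graft(lambda: next(replies)))
```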
DISTANCE VECTOR MULTICAST ALGORITHM 189
. The path taken by a multicast datagram depends on both the datagram's source and
its multicast destination; this is called source/destination routing. It is in
contrast to most unicast datagram forwarding algorithms (as is the case for
OSPF), which route based solely on destination.
. The path taken between the datagram's source and any particular destination
group member is the least cost path available. Cost is expressed in terms of the
OSPF link-state metric. For example, if the OSPF metric represents delay, a
minimum-delay path is chosen. OSPF metrics are configurable. A metric is
assigned to each outbound router interface, representing the cost of sending a
packet on that interface. The cost of a path is the sum of the costs of its
constituent (outbound) router interfaces.
. MOSPF takes advantage of any commonality of least cost paths to destination
group members. However, when members of the multicast group are spread out
over multiple networks, the multicast datagram must at times be replicated. This
replication is performed as few times as possible (at the tree branches), taking
maximum advantage of common path segments.
. For a given multicast datagram, all routers calculate an identical shortest path tree.
There is a single path between the datagram's source and any particular destination
group member. This means that, unlike OSPF's treatment of regular (unicast)
IP data traffic, there is no provision for equal-cost multipath.
. On each packet hop, MOSPF normally forwards IP multicast datagrams as data
link multicasts. There are two exceptions. First, on nonbroadcast networks, since
there are no data link multicast/broadcast services, the datagram must be
forwarded to specific MOSPF neighbors. Second, a MOSPF router can be
configured to forward IP multicasts on specific networks as data link unicasts,
in order to avoid datagram replication in certain anomalous situations.
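The cost model in the list above (path cost as the sum of outbound interface costs, one path per destination) can be illustrated with a generic Dijkstra computation. This is a sketch, not the MOSPF calculation itself: the topology, node names, and the name-based tie-break are assumptions, and the actual protocol defines its own tie-breaking rules so that all routers arrive at the same tree.

```python
import heapq


def shortest_path_tree(links, source):
    """Dijkstra over directed links {node: {neighbor: outbound_cost}}.

    Returns {node: (cost_from_source, parent)}. Each node gets exactly one
    parent, so there is a single path from the source to every destination.
    """
    tree = {}
    heap = [(0, source, None)]
    done = set()
    while heap:
        cost, node, parent = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        tree[node] = (cost, parent)
        # Sorted iteration plus tuple ordering gives a deterministic result.
        for neighbor, link_cost in sorted(links.get(node, {}).items()):
            if neighbor not in done:
                heapq.heappush(heap, (cost + link_cost, neighbor, node))
    return tree


# Hypothetical topology: costs are per outbound interface.
links = {"S": {"A": 1, "B": 4}, "A": {"B": 2, "C": 5}, "B": {"C": 1}}
tree = shortest_path_tree(links, "S")
print(tree)
```

In this example the path S, A, B, C costs 1 + 2 + 1 = 4, which beats S, B, C (4 + 1 = 5) and S, A, C (1 + 5 = 6).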
As noted, the location of every group member is communicated to the rest of the
network. This ensures that multicast datagrams can be forwarded to each member. OSPF
uses group membership LSAs to track the location of each group member. These LSAs
are stored in the OSPF link-state database that effectively describes the topology of the
AS [PAR200601].
While MOSPF optimizes the path to any given group member, it does not necessarily
optimize the use of the internetwork as a whole. To do so, instead of calculating
source-based shortest path trees, something similar to a minimal spanning tree
(containing only the group members) would need to be calculated. This type of minimal
spanning tree is called a Steiner tree in the literature.
In a multiple-area environment, provisions need to be made to maintain global
topology information because a router is only aware of the network topology within the
local area. Within an OSPF area, the Area Border Router (ABR; see footnote 2) forwards
routing information and data traffic between areas; in an MOSPF environment this
function is performed by an interarea multicast forwarder. The interarea multicast
forwarder forwards group membership information and multicast datagrams between areas.
Because group membership LSAs are only flooded within an area, a process in the
interarea multicast forwarder is needed to convey membership information between
areas. To do this, each interarea multicast forwarder summarizes the attached area's
group membership information and forwards this information to the OSPF backbone. This
announcement consists of a group membership LSA listing each group containing
members in the nonbackbone area. The advertisement supports the same function as the
summary LSAs generated in a standard OSPF area. Membership information for the
nonbackbone area is summarized into the backbone; however, this information is not
readvertised into other nonbackbone areas. To forward multicast data traffic between areas,
a wildcard multicast receiver is utilized. This is a router to which all multicast traffic,
regardless of destination, is forwarded. In nonbackbone areas, all interarea multicast
forwarders are wildcard multicast receivers. This ensures that all multicast traffic that
2
Aka Autonomous System Border Router (ASBR).
MULTICAST OSPF 193
REFERENCES
[PAR200601] L. Parziale, W. Liu, et al., TCP/IP Tutorial and Technical Overview, IBM Press,
Redbook Abstract, IBM Form Number GG24-3376-07, 2006.
[RFC1075] RFC 1075, Distance Vector Multicast Routing Protocol, D. Waitzman, C. Partridge, S.
Deering, November 1988.
[RFC1584] RFC 1584, Multicast Extensions to OSPF, J. Moy, March 1994.
[RFC2201] RFC 2201, Core Based Trees (CBT) Multicast Routing Architecture, A. Ballardie,
September 1997.
9
IP MULTICASTING IN IPv6
ENVIRONMENTS
This chapter discusses IPv6 and multicast applications. The first part is a short tutorial on
IPv6; the second part looks at multicast-specific issues. IPv6 is now seeing major
deployment in Europe and Asia; eventually, it will also see deployment in North
America.
IPv6 is now gaining momentum as an improved network layer protocol. There is a lot
of commercial interest and activity in Europe and Asia, and as of press time, there was
also some traction in the United States. For example, the U.S. Department of Defense
(DoD) announced that from October 1, 2003, all new developments and procurements
needed to be IPv6 capable; the DoD's goal was to complete the transition to IPv6 for all
intra- and internetworking across the agency by 2008. In 2005, the U.S. Government
Accountability Office (GAO) recommended that all agencies become proactive in
planning a coherent transition to IPv6. The expectation is that in the next few years,
a transition to this new protocol will occur worldwide [MIN200801].
OPPORTUNITIES OFFERED BY IPv6 195
While the basic function of IP is to move information across networks, IPv6 has more
capabilities built into its foundation than IPv4. A key capability is the significant increase
in address space. For example, all devices could have a public IP address, so that they can
be uniquely tracked. Today, inventory management of dispersed Information Technology
(IT) assets cannot be achieved with IP mechanisms; during the inventory cycle, someone
has to manually verify the location of each desktop computer. With IPv6, one can use the
network to verify that such equipment is there; even non-IT equipment in the field can
be tracked by having an IP address permanently assigned to it. IPv6 also has extensive
automatic configuration (autoconfiguration) mechanisms that reduce the IT burden by
making configuration essentially plug-and-play.
IP was designed in the 1970s for the purpose of connecting computers that were in
separate geographic locations. Computers on a campus were connected by means of local
networks, but these local networks were separated into essentially stand-alone islands.
Internet, as a name to designate the protocol and more recently the worldwide
information network, simply means “internetwork,” that is, a connection between
networks. In the beginning, the protocol had only military use, but computers from
universities and enterprises were quickly added. The Internet as a worldwide information
network is the result of the practical application of IP, that is, the result of the
interconnection of a large set of information networks [IPV200501]. Starting in the
early 1990s, developers realized that the communication needs of the 21st century
needed a protocol with some new features and capabilities while at the same time
retaining the useful features of the existing protocol.
While link-level communication does not generally require a node identifier
(address) since the device is intrinsically identified with the link-level address,
communication over a group of links (a network) does require unique node identifiers
(addresses). The IP address is an identifier that is applied to each device connected to
an IP network. In this setup, different elements taking part in the network (servers,
routers, user computers, etc.) communicate with each other using their IP address
as an entity identifier. In IPv4, addresses consist of four octets. For ease of human
conversation, IP addresses are represented as four decimal numbers separated by periods,
for example, 166.74.110.83, where each decimal number is a shorthand for (and
corresponds to) the binary code of the octet in question (an 8-bit number takes a value
in the 0–255 range). Since the IPv4 address has 32 bits, there are nominally 2^32
different IP addresses (approximately four billion nodes if all combinations are used).
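The correspondence between the dotted-decimal form and the underlying 32-bit value can be checked with Python's standard `ipaddress` module. The snippet is only an illustration of the notation described above, using the example address from the text.

```python
import ipaddress

addr = ipaddress.IPv4Address("166.74.110.83")

# Each decimal number is one octet of the 32-bit address.
octets = list(addr.packed)
print(octets)

# The same address as a single 32-bit binary string.
bits = format(int(addr), "032b")
print(bits)

# Nominal size of the IPv4 address space.
ipv4_space = 2 ** 32
print(ipv4_space)
```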
IPv6 is the Internet's next-generation protocol, which was at first called IPng
(“Internet Next Generation”). The IETF developed the basic specifications during the
1990s to support a migration to a new environment. IPv6 is defined in RFC 2460, which
obsoletes RFC 1883. [The “version 5” reference was employed for another use (an
experimental real-time streaming protocol), and to avoid any confusion, it was decided
not to use this nomenclature.]
. Scalability: IPv6 has 128-bit addresses versus 32-bit IPv4 addresses. With IPv4,
the theoretical number of available IP addresses is 2^32 (about 4.3 x 10^9, on the
order of 10^10). IPv6 offers a 2^128 space. Hence, the number of available unique node
addresses is 2^128 (about 3.4 x 10^38, on the order of 10^39).
. Security: IPv6 includes security in its specifications such as payload encryption
and authentication of the source of the communication.
. Real-time applications: To provide better support for real-time traffic (e.g., VoIP),
IPv6 includes “labeled flows” in its specifications. By means of this mechanism
routers can recognize the end-to-end flow to which transmitted packets belong.
This is similar to the service offered by MPLS, but it is intrinsic with the IP
mechanism rather than an add-on. Also, it preceded this MPLS feature by a
number of years.
. Traditional Class A address: Class A uses the first bit of the 32-bit space (bit 0) to
identify it as a Class A address; this bit is set to 0. Bits 1–7 represent the network
ID and bits 8–31 identify the PC, terminal device, or host/server on the network.
This address space supports 2^7 - 2 = 126 networks and approximately 16 million
devices (2^24) on each network. By convention, the use of an "all 1s" or "all 0s"
address for both the network ID and the host ID is prohibited (which is the reason
for subtracting the 2 above).
. Traditional Class B address: Class B uses the first 2 bits (bit 0 and bit 1) of the
32-bit space to identify it as a Class B address; these bits are set to 10. Bits 2–15
represent the network ID and bits 16–31 identify the PC, terminal device, or host/
server on the network. This address space supports 2^14 - 2 = 16,382 networks and
2^16 - 2 = 65,534 devices on each network.
. Traditional Class C address: Class C uses the first 3 bits (bit 0, bit 1, and bit 2) of
the 32-bit space to identify it as a Class C address; these bits are set to 110. Bits 3–23
represent the network ID and bits 24–31 identify the PC, terminal device, or
host/server on the network. This address space supports about 2 million networks
(2^21 - 2) and 2^8 - 2 = 254 devices on each network.
. Traditional Class D address: This class is used for broadcasting and/or multicasting,
wherein all devices on the network receive the same packet. Class D uses the first 4
bits (bit 0, bit 1, bit 2, and bit 3) of the 32-bit space to identify it as a Class D address;
these bits are set to 1110.
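The leading-bit rules and the network/host counts in the list above can be reproduced with a short sketch. The function name and the `CLASS_SIZES` table are introduced here for illustration; the final "E" branch covers the remaining space (leading bits 1111), which the list above does not discuss.

```python
import ipaddress


def address_class(address: str) -> str:
    """Classify an IPv4 address by the leading bits of its first octet."""
    first_octet = ipaddress.IPv4Address(address).packed[0]
    if first_octet >> 7 == 0b0:
        return "A"
    if first_octet >> 6 == 0b10:
        return "B"
    if first_octet >> 5 == 0b110:
        return "C"
    if first_octet >> 4 == 0b1110:
        return "D"
    return "E"  # leading bits 1111; not covered in the list above


# (networks, hosts per network), excluding the all-0s and all-1s values.
CLASS_SIZES = {
    "A": (2**7 - 2, 2**24 - 2),
    "B": (2**14 - 2, 2**16 - 2),
    "C": (2**21 - 2, 2**8 - 2),
}

for sample in ("10.0.0.1", "166.74.110.83", "192.168.1.1", "224.0.0.1"):
    print(sample, address_class(sample))
print(CLASS_SIZES)
```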
more hosts allowed. Blocks of Class C network numbers are allocated to each network
service provider; organizations using the network service provider for Internet
connectivity are allocated subsets of the service provider's address space as required.
These multiple Class C addresses can then be summarized in routing tables, resulting
in fewer route advertisements. The CIDR mechanism can also be applied to blocks of
Class A and B addresses [TEA200401]. All of this assumes, however, that the
institution in question already has an assigned set of public, registered IP addresses;
it does not address the issue of how to get additional public, registered, globally
unique IP addresses.
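The route summarization mentioned above can be illustrated with the standard `ipaddress` module. The four contiguous /24 blocks below are an assumed example (taken from private address space), not addresses from the text.

```python
import ipaddress

# Four contiguous Class C sized (/24) blocks, chosen only for illustration.
blocks = [ipaddress.ip_network(f"192.168.{n}.0/24") for n in range(4)]

# collapse_addresses merges adjacent networks into the fewest covering prefixes.
summary = list(ipaddress.collapse_addresses(blocks))
print(summary)
```

Four routing table entries become one /22 advertisement, which is the reduction in route advertisements the paragraph refers to.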
A number of protocols cannot travel through a NAT device and hence the use of NAT
implies that many applications (e.g., VoIP) cannot be used effectively in all instances. As
a consequence, these applications can only be used in intranets. Examples include
[IPV200501] the following:
The relatively large size of the IPv6 address is designed to be subdivided into
hierarchical routing domains that reflect the topology of the modern-day Internet. The
use of 128 bits provides multiple levels of hierarchy and flexibility in designing
hierarchical addressing and routing. The IPv4-based Internet currently lacks this
flexibility [MSD200401].
The IPv6 address is represented as eight groups of 16 bits each separated by the ":"
character. Each 16-bit group is represented by four hexadecimal digits, that is, each digit
has a value between 0 and F (0, 1, 2, ..., A, B, C, D, E, F with A = 10, B = 11, etc., to
F = 15). What follows is an IPv6 address example:
3223:0BA0:01E0:D001:0000:0000:D0F0:0010
An abbreviated format exists to designate IPv6 addresses whose trailing groups are
all 0. For example,
3223:0BA0::
is the abbreviated form of the following address:
3223:0BA0:0000:0000:0000:0000:0000:0000
Similarly, leading 0s within a group can be removed, and a group of four 0s in the
middle of the address can be written as a single 0. For example, the address
3223:BA0:0:0:0:0::1234
is the abbreviated form of the following address:
3223:0BA0:0000:0000:0000:0000:0000:1234
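The abbreviation rules can be checked against the examples above with Python's standard `ipaddress` module, which parses any valid textual form and can print both the fully expanded and the shortest form. Note that the module prints hexadecimal digits in lowercase.

```python
import ipaddress

full = ipaddress.IPv6Address("3223:0BA0:0000:0000:0000:0000:0000:1234")
partly_abbreviated = ipaddress.IPv6Address("3223:BA0:0:0:0:0::1234")
trailing_zeros = ipaddress.IPv6Address("3223:0BA0::")

print(full.compressed)            # shortest textual form
print(trailing_zeros.exploded)    # all eight groups written out
print(partly_abbreviated == full)
```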
There is also a method to designate groups of IP addresses or subnetworks that is
based on specifying the number of bits that designate the subnetwork, beginning from left
to right, using remaining bits to designate single devices inside the network. For example,
the notation
3223:0BA0:01a0::/48
indicates that the part of the IP address used to represent the subnetwork has 48 bits. Since
each hexadecimal digit has 4 bits, this points out that the part used to represent the
subnetwork is formed by 12 digits, that is: “3223:0BA0:01A0.” The remaining digits of
the IP address would be used to represent nodes inside the network.
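The prefix-length notation can be explored the same way. The two host addresses tested below are made up for illustration; only the /48 prefix comes from the text.

```python
import ipaddress

net = ipaddress.IPv6Network("3223:0BA0:01a0::/48")

prefix_bits = net.prefixlen        # bits that identify the subnetwork
node_bits = 128 - net.prefixlen    # bits left to identify nodes
print(prefix_bits, prefix_bits // 4, node_bits)

inside = ipaddress.IPv6Address("3223:ba0:1a0:7::1") in net
outside = ipaddress.IPv6Address("3223:ba0:1a1::1") in net
print(inside, outside)
```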
There are a number of special IPv6 addresses, as follows:
link-local addresses is "the local link"; the reachability of site-local addresses (see
footnote 1) is "the private intranet"; and the reachability of global addresses is "the
IPv6-enabled Internet." IPv6 interfaces can have multiple addresses that have different
reachability scopes. For example, a node may have a link-local address, a site-local
address, and a global address. Note: IPv6 actually has 16 possible scope values, hex 0 to
hex F; some of these scopes are unused.
Like IPv4, IPv6 is a connectionless, unreliable datagram protocol used primarily
for addressing and routing packets between hosts. Connectionless means that a session
is not established before exchanging data. Unreliable means that delivery is not
guaranteed. IPv6 always makes a best effort attempt to deliver a packet. An IPv6
packet might be lost, delivered out of sequence, duplicated, or delayed. IPv6 per se does
not attempt to recover from these types of errors. The acknowledgment of packets
delivered and the recovery of lost packets is done by a higher layer protocol, such as
TCP [MSD200401]. From a packet forwarding perspective IPv6 operates just like
IPv4. An IPv6 packet, also known as an IPv6 datagram, consists of an IPv6 header and
an IPv6 payload.
1
Site-local unicast addresses were deprecated by the IETF in 2003; their description herewith is for historical
reference.
The IPv6 header consists of two parts, the IPv6 base header and optional
extension headers. Functionally, the optional extension headers and upper layer
protocols, for example, TCP, are considered part of the IPv6 payload. Table 9.4 shows
the fields in the IPv6 base header. IPv4 headers and IPv6 headers are not directly
interoperable: hosts and/or routers must use an implementation of both IPv4 and IPv6
in order to recognize and process both header formats. This gives rise to a number of
complexities in the migration process between the IPv4 and the IPv6 environments.
However, techniques have been developed to handle these migrations.
address is invalidated and the address can be reallocated to other interfaces. For the
suitable management of address expiration time, an address goes through two states
(stages) while it is affiliated to an interface [IPV200501]:
clients. RIRs are recommending ISPs and operators allocate to each IPv6 client a /48
subnetwork; this allows clients to manage their own subnetworks without using NAT.
(The implication is that the need for NAT disappears in IPv6.)
In order to allow its maximum scalability, IPv6 uses an approach based on a basic
header, with minimum information. This differentiates it from IPv4 where different
options are included in addition to the basic header. IPv6 uses a header
“concatenation” mechanism to support supplementary capabilities. The advantages
of this approach include the following:
. The size of the basic header is always the same and is well known. The basic
header has been simplified compared with IPv4, since only 8 fields are used
instead of 12. The basic IPv6 header has a fixed size; hence, its processing by
nodes and routers is more straightforward. Also, the header's structure aligns to 64
bits, so that new and future processors (64 bits minimum) can process it in a more
efficient way.
. Routers placed between a source point and a destination point (that is, the route
that a specific packet has to pass through) do not need to process or understand any
“following headers.” In other words, in general, interior (core) points of the
network (routers) only have to process the basic header, while in IPv4 all headers
must be processed. This flow mechanism is similar to the operation in MPLS yet
precedes it by several years.
. There is no limit to the number of options that the headers can support (the IPv6
basic header is 40 octets in length, while the IPv4 one varies from 20 to 60 octets,
depending on the options used).
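The fixed 40-octet, 8-field base header can be sketched with Python's `struct` module. The field layout used here (version, traffic class, flow label, payload length, next header, hop limit, source address, destination address) follows RFC 2460 rather than anything reproduced in this section; the function name, default hop limit, and sample addresses are assumptions.

```python
import ipaddress
import struct


def build_ipv6_base_header(src, dst, payload_length, next_header,
                           hop_limit=64, traffic_class=0, flow_label=0):
    """Pack the 8 base-header fields into exactly 40 octets."""
    # Version (4 bits) | traffic class (8 bits) | flow label (20 bits).
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    return struct.pack(
        "!IHBB16s16s",
        first_word,
        payload_length,
        next_header,
        hop_limit,
        ipaddress.IPv6Address(src).packed,
        ipaddress.IPv6Address(dst).packed,
    )


# Sample values only; 58 is the next header value for ICMPv6.
header = build_ipv6_base_header("fe80::1", "ff02::1", payload_length=24,
                                next_header=58, hop_limit=1)
print(len(header), header[0] >> 4, header[6], header[7])
```

Any further capability is carried in extension headers chained after these 40 octets via the next header field, which is why the base header never changes size.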
In IPv6, interior/core routers do not perform packet fragmentation, but the
fragmentation is performed end to end. That is, source and destination nodes perform, by
means of the IPv6 stack, the fragmentation of a packet and the reassembly, respectively.
The fragmentation process consists of dividing the source packet into smaller packets or
fragments [IPV200501].
A “jumbogram” is an option that allows an IPv6 packet to have a payload greater
than 65,535 bytes. Jumbograms are identified with a 0 value in the payload length in the
IPv6 header field and include a jumbo payload option in the hop-by-hop option header. It
is anticipated that such packets will be used, in particular, for multimedia traffic.
This preliminary overview of IPv6 highlights the advantages of the new protocol and
its applicability to a whole range of applications, including VoIP.
embedded IPv4 addresses. Tunneling, which we already described in passing, will play
a major role in the beginning:
There are a number of requirements that are typically applicable to an organization
wishing to introduce an IPv6 service [6NE200501]:
. The existing IPv4 service should not be adversely disrupted (e.g., as it might be by
router loading of encapsulating IPv6 in IPv4 for tunnels).
. The IPv6 service should perform as well as the IPv4 service (e.g., at the IPv4 line
rate and with similar network characteristics).
. The service must be manageable and be able to be monitored (thus tools should be
available for IPv6 as they are for IPv4).
. The security of the network should not be compromised due to the additional
protocol itself or weakness of any transition mechanism used.
. An IPv6 address allocation plan must be drawn up.
. Dual IP layer (also known as dual stack): A technique for providing complete
support for both IPs—IPv4 and IPv6—in hosts and routers.
. Configured tunneling of IPv6 over IPv4: Point-to-point tunnels made by
encapsulating IPv6 packets within IPv4 headers to carry them over IPv4 routing
infrastructures.
. Automatic tunneling of IPv6 over IPv4: A mechanism for using IPv4-compatible
addresses to automatically tunnel IPv6 packets over IPv4 networks.
Applications (and the lower layer protocol stack) need to be properly equipped.
There are four cases [RFC4038]:
Case 1: IPv4-only applications in a dual-stack node. IPv6 is introduced in a
node, but applications are not yet ported to support IPv6. The protocol stack is as
follows:
—
—
Case 3: Applications supporting both IPv4 and IPv6 in a dual-stack node.
Applications are ported for both IPv4 and IPv6 support. Therefore, the existing IPv4
applications can be removed. The protocol stack is as follows:
—
MULTICAST WITH IPv6 211
The first two cases are not interesting in the longer term; only a few applications are
inherently IPv4 or IPv6 specific, and most should work with both protocols without
having to care about which one is being used.
Next, we focus on multicast issues, including layer 3 addressing. Note at this juncture that
IPv6 multicast does not support DM.
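[Figure: IPv6 multicast address format. The 128-bit address begins with an 8-bit prefix
of 1111 1111 (FF), followed by an 8-bit field holding 4 flag bits (0, R, P, T) and a
4-bit scope, and then the group ID. Flags: R = RP address embedded; P = prefix for
unicast-based assignments; T = 0 if permanent, 1 if temporary. Scope values: 1 = node,
2 = link, 5 = site, 8 = organization, E = global. The figure also shows a corresponding
Ethernet address, 33 33 FD 0F FE 17, where 33 33 is the multicast prefix for Ethernet
multicast.]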
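The multicast address layout (FF prefix, the flag bits 0/R/P/T, and the scope nibble) and the mapping to an Ethernet address can be sketched as follows. The rule that the Ethernet address is 33:33 followed by the low-order 32 bits of the IPv6 multicast address comes from RFC 2464, not from this section; the function name and the two sample group addresses are assumptions.

```python
import ipaddress


def describe_multicast(address: str) -> dict:
    """Decode the flags and scope of an IPv6 multicast address and map it to Ethernet."""
    packed = ipaddress.IPv6Address(address).packed
    if packed[0] != 0xFF:
        raise ValueError("not an IPv6 multicast address")
    flags, scope = packed[1] >> 4, packed[1] & 0x0F
    # 33:33 prefix followed by the last four octets of the IPv6 address.
    mac = bytes([0x33, 0x33]) + packed[-4:]
    return {
        "R": bool(flags & 0b0100),
        "P": bool(flags & 0b0010),
        "T": bool(flags & 0b0001),
        "scope": scope,
        "ethernet": ":".join(f"{b:02x}" for b in mac),
    }


print(describe_multicast("ff0e::fd0f:fe17"))       # permanent, global scope
print(describe_multicast("ff7e:140:2001:db8::1"))  # R, P, and T all set
```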
9.4.3 Signaling
Just as is the case in IPv4, IPv6 hosts (receivers) must signal a router with their desire
to receive data from a specific group. IPv6 multicast does not use IGMP but rather uses
MLD. MLDv1 is similar to IGMPv2, and MLDv2 is similar to IGMPv3. This topic is
revisited in Chapter 10.
9.4.4 RP Approaches
Recall that in SM PIM sources must send their traffic to an RP; this traffic is in turn
forwarded to receivers on a shared distribution tree. In IPv6, auto-RP is not currently
available; however, there is a BSR for IPv6, and an RP can also be configured statically
or embedded in the group address (embedded RP). Static use is acceptable in the
intradomain, but not within the interdomain. Embedded RP is a viable solution for those
applications that cannot leverage SSM and that require a PIM SM model to interoperate
across multiple domains [CIS200701]. Embedded RP uses the R flag discussed above: when
the flags R, P, and T are set to 1, this indicates that the RP address is embedded in the
group address.
REFERENCES
[6NE200501] 6NET, D2.2.4: Final IPv4 to IPv6 Transition Cookbook for Organisational/ISP
(NREN) and Backbone Networks, Version:1.0, Project Number: IST-2001-32603, CEC
Deliverable Number 32603/UOS/DS/2.2.4/A1, February 4, 2005.
[CIS200701] Cisco Systems, Internet Protocol (IP) Multicast Technology Overview and White
Papers, Cisco Systems, San Jose, CA.
[DAV200201] J. Davies, Understanding IPv6, Microsoft Press, 2002.
[DEM200301] Desmeules, Cisco Self-Study: Implementing IPv6 Networks (IPV6), Pearson
Education, May 2003.
[GON199801] M. Goncalves, K. Niles, IPv6 Networks, McGraw-Hill Osborne, 1998.
[GOS200301] S.Goswami, Internet Protocols: Advances, Technologies, and Applications, Kluwer
Academic Publishers, May 2003.
[GRA200001] B. Graham, TCP/IP Addressing: Designing and Optimizing Your IP Addressing
Scheme, 2nd ed., Morgan Kaufmann, 2000.
[HAG200201] S. Hagen, IPv6 Essentials, O'Reilly, 2002.
[HUI199701] C. Huitema, IPv6 the New Internet Protocol, 2nd ed., Prentice-Hall, 1997.
[IPV200401] IPv6Forum, IPv6 Vendors Test Voice, Wireless and Firewalls on Moonv6,
http://www.ipv6forum.com/modules.php?op=modload&name=News&file=article&sid=15&mode=thread&order=0&thold=0,
November 15, 2004.
[IPV200501] IPv6 Portal, http://www.ipv6tf.org/meet/faqs.php.
[ITO200401] J. Itojun Hagino, IPv6 Network Programming, Butterworth-Heinemann, 2004.
[LEE200501] H. K. Lee, Understanding IPv6, Springer-Verlag, New York, 2005.
[LOS200301] P. Loshin, IPv6: Theory, Protocol, and Practice, 2nd ed., Elsevier Science &
Technology Books, 2003.
[MIL199701] M.A. Miller, Implementing IPv6: Migrating to the Next Generation Internet
Protocol, Wiley, 1997.
[MIL200001] M. Miller, P. E. Miller, Implementing IP V6: Supporting the Next Generation
Internet Protocols, 2nd ed., Hungry Minds, 2000.
[MIN200601] D. Minoli, VoIP over IPv6, Elsevier, 2006.
[MIN200801] J. J. Amoss, D. Minoli, Handbook of IPv4 to IPv6 Transition Methodologies for
Institutional and Corporate Networks, TF-ARBCH, New York, 2008.
[MSD200401] Microsoft Corporation, MSDN Library, Internet Protocol,
http://msdn.microsoft.com, 2004.
[MUR200501] N. R. Murphy, D. Malone, IPv6 Network Administration, O'Reilly & Associates,
2005.
[RFC2460] RFC 2460, Internet Protocol, Version 6 (IPv6) Specification, S. Deering, R. Hinden,
December 1998.
[RFC2893] RFC 2893, Transition Mechanisms for IPv6 Hosts and Routers, R. Gilligan, E.
Nordmark, August 2000.
[RFC3022] RFC 3022, Traditional IP Network Address Translator (Traditional NAT), P. Srisuresh,
K. Egevang, January 2001.
[RFC3306] RFC 3306, Unicast-Prefix-Based IPv6 Multicast.
[RFC4038] RFC 4038, Application Aspects of IPv6 Transition, M-K. Shin, Ed., Y-G. Hong, J.
Hagino, P. Savola, E.M. Castro, March 2005.
[SOL200401] H. S. Soliman, Mobile IPv6, Pearson Education, 2004.
[TEA200401] D. Teare, C. Paquet, CCNP Self-Study: Advanced IP Addressing, Cisco Press, June
11, 2004.
[WEG199901] J. D. Wegner, IP Addressing and Subnetting, Including IPv6, Elsevier Science &
Technology Books, 1999.
10
MULTICAST LISTENER
DISCOVERY
Just as is the case in IPv4, IPv6 hosts (receivers) signal a router with their desire to receive
data from a specific group. IPv6 multicast does not use the IGMP but rather the Multicast
Listener Discovery (MLD) protocol. MLD is used by an IPv6 router to discover the
presence of multicast listeners on directly attached links and to discover which multicast
addresses are of interest to those neighboring nodes. MLDv1 is similar to IGMPv2, and
MLDv2 is similar to IGMPv3. MLDv2 adds the ability for a node to report interest in
listening to packets with a particular multicast address only from specific source
addresses, or from all sources except for specific source addresses, this being similar
to SSM. Recall that SSM is a form of multicast where a receiver must specify both the
network layer address of the source and the multicast destination address to receive the
multicast datagrams of interest.
This chapter provides an overview of MLD based on RFC 2710 (see footnote 1), Multicast
Listener Discovery (MLD) for IPv6, and RFC 3810, Multicast Listener Discovery
Version 2 (MLDv2) for IPv6. The focus is on MLDv1. Due to the commonality of
function, the term Group Management Protocol (GMP) is sometimes used to refer to both
IGMP and MLD.
1
Copyright (C) The Internet Society (1999). All rights reserved. This document and translations of it may be
copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its
implementation may be prepared, copied, published, and distributed, in whole or in part, without restriction of
any kind provided that the copyright notice and this paragraph are included on all such copies and derivative
works.
MLD [RFC 2710, RFC 3550, RFC 3810] specifies the protocol used by an IPv6 router to
discover the presence of multicast listeners (i.e., nodes wishing to receive multicast
packets) on its directly attached links and to discover which multicast addresses are of
interest to those neighboring nodes. This information is then provided to the multicast
routing protocol being used by the router to ensure that multicast packets are delivered
to all links where there are interested receivers. MLD is derived from version 2 of
IPv4's IGMP (IGMPv2). One important difference is that MLD uses ICMPv6 message types
(IP protocol 58) rather than IGMP message types (IP protocol 2).
MLD is an asymmetric protocol, specifying different behaviors for multicast
listeners and for routers. For those multicast addresses to which a router itself is listening,
the router performs both parts of the protocol, the "multicast router part" and the
"multicast address listener part," including responding to its own messages. If a router
has more than one interface to the same link, it needs to perform the router part of
MLD over only one of those interfaces. Listeners, by contrast, must perform the
listener part of MLD on all interfaces from which an application or upper layer protocol
has requested reception of multicast packets.
Note that a multicast router may itself be a listener of one or more multicast
addresses; in this case it performs both the multicast router part and the multicast address
listener part of the protocol to collect the multicast listener information needed by its
multicast routing protocol on the one hand and to inform itself and other neighboring
multicast routers of its listening state on the other hand.
MLD is a subprotocol of ICMPv6; that is, MLD message types are a subset of the
set of ICMPv6 messages, and MLD messages are identified in IPv6 packets by a
preceding next header value of 58. All MLD messages are sent with a link-local IPv6
source address, an IPv6 hop limit of 1, and an IPv6 Router Alert option in a Hop-by-Hop
Options header. (The Router Alert option is necessary to cause routers to examine
MLD messages sent to multicast addresses in which the routers themselves have no
interest.)
MLD messages have the format depicted in Figure 10.1.
2
The discussion of Sections 10.1–10.5 is based on and summarized from RFC 2710.
MESSAGE FORMAT 217
[Figure 10.1: MLD message format; the only field label that survives is Multicast Address.]
1. Multicast Listener Query (type = decimal 130), also known as "Query." There
are two subtypes of Multicast Listener Query messages (differentiated by the
contents of the Multicast Address field):
   . General Query, used to learn which multicast addresses have listeners on an
     attached link.
   . Multicast-Address-Specific Query, used to learn if a particular multicast
     address has any listeners on an attached link.
2. Multicast Listener Report (type = decimal 131), also known as "Report."
3. Multicast Listener Done (type = decimal 132), also known as "Done."
The length of a received MLD message is computed by taking the IPv6 payload
length value and subtracting the length of any IPv6 extension headers present between
the IPv6 header and the MLD message. If that length is greater than 24 octets, it indicates
that there are other fields present beyond the fields described above, perhaps belonging to
a future backwards-compatible version of MLD. An implementation of the version of
MLD specified in RFC 2710 must not send an MLD message longer than 24 octets
and must ignore anything past the first 24 octets of a received MLD message. In all cases,
the MLD checksum must be computed over the entire MLD message, not just the first
24 octets.
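As an illustration of the length rule above, the following Python sketch decodes the first 24 octets of an MLD message and ignores anything beyond them. The field layout (Type, Code, Checksum, Maximum Response Delay, Reserved, Multicast Address) is taken from RFC 2710, since Figure 10.1 is not reproduced here; the function name `parse_mld` and the returned dictionary keys are hypothetical, and checksum verification is deliberately left out.

```python
import ipaddress
import struct

MLD_QUERY, MLD_REPORT, MLD_DONE = 130, 131, 132
MLD_LEN = 24  # octets in an MLD (version 1) message


def parse_mld(payload: bytes) -> dict:
    """Decode the first 24 octets of an MLD message; later octets are ignored."""
    if len(payload) < MLD_LEN:
        raise ValueError("MLD message shorter than 24 octets")
    # Type(8) Code(8) Checksum(16) MaxResponseDelay(16) Reserved(16) Address(128)
    msg_type, code, checksum, max_delay, _reserved, raw_addr = struct.unpack(
        "!BBHHH16s", payload[:MLD_LEN]
    )
    group = ipaddress.IPv6Address(raw_addr)
    kind = {MLD_REPORT: "report", MLD_DONE: "done"}.get(msg_type, "unknown")
    if msg_type == MLD_QUERY:
        # A zero Multicast Address field distinguishes a general query.
        kind = "general-query" if int(group) == 0 else "address-specific-query"
    return {
        "kind": kind,
        "code": code,
        "checksum": checksum,  # not verified in this sketch
        "max_response_delay_ms": max_delay,
        "group": group,
    }
```

A real implementation would also verify the ICMPv6 checksum over the entire message, as the text requires.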
Routers use MLD to learn which multicast addresses have listeners on each of their attached
links. Each router keeps a list, for each attached link, of which multicast addresses have
listeners on that link and a timer associated with each of those addresses. Note that the
router needs to learn only that listeners for a given multicast address are present on a link;
it does not need to learn the identity (e.g., unicast address) of those listeners or even how
many listeners are present.
For each attached link, a router selects one of its link-local unicast addresses on that
link to be used as the IPv6 source address in all MLD packets it transmits on that link.
For each interface over which the router is operating the MLD protocol, the router
must configure that interface to listen to all link-layer multicast addresses that can be
generated by IPv6 multicasts. For example, an Ethernet-attached router must set its
Ethernet address reception filter to accept all Ethernet multicast addresses that start
with the hexadecimal value 3333 (as covered in Chapter 9); in the case of an Ethernet
interface that does not support the filtering of such a range of multicast addresses, it
must be configured to accept all Ethernet multicast addresses to meet the requirements of
MLD.
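The Ethernet filtering requirement can be made concrete with a short Python sketch. It assumes the standard IPv6-over-Ethernet mapping (the octets 33-33 followed by the low-order 32 bits of the IPv6 multicast address); the function names are hypothetical.

```python
import ipaddress


def ethernet_multicast_mac(group: str) -> str:
    """Map an IPv6 multicast address to its Ethernet multicast address."""
    addr = ipaddress.IPv6Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv6 multicast address")
    # 33:33 prefix followed by the last four octets of the IPv6 address.
    octets = b"\x33\x33" + addr.packed[-4:]
    return ":".join(f"{b:02x}" for b in octets)


def filter_accepts(mac: str) -> bool:
    """Reception filter an MLD router needs: accept every 33:33:xx:xx:xx:xx."""
    return mac.lower().startswith("33:33:")
```

For example, `ethernet_multicast_mac("ff02::1")` yields `33:33:00:00:00:01`, which the filter accepts.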
With respect to each of its attached links, a router may assume one of two
roles: querier or nonquerier. There is normally only one querier per link. All routers
start up as a querier on each of their attached links. If a router hears a Query message
whose IPv6 source address is numerically less than its own selected address for that
link, it must become a nonquerier on that link. If the timer [Other Querier Present
Interval] passes without receiving, from a particular attached link, any queries from a
router with an address less than its own, a router resumes the role of querier on that link.
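The election rule just described can be sketched in Python as follows. This is a minimal illustration, not a complete router implementation: the class name, the explicit `now` arguments, and the polling method `on_tick` are assumptions made for the example.

```python
import ipaddress


class QuerierElection:
    """Per-link querier/nonquerier role, driven by received queries."""

    def __init__(self, own_address: str, other_querier_present_interval: float):
        self.own = ipaddress.IPv6Address(own_address)
        self.interval = other_querier_present_interval
        self.is_querier = True   # every router starts up as a querier
        self.deadline = None     # when to resume the querier role

    def on_query(self, source: str, now: float) -> None:
        # Only a query from a numerically lower address demotes this router.
        if ipaddress.IPv6Address(source) < self.own:
            self.is_querier = False
            self.deadline = now + self.interval

    def on_tick(self, now: float) -> None:
        # No lower-address query heard for the whole interval: resume.
        if not self.is_querier and now >= self.deadline:
            self.is_querier = True
            self.deadline = None
```

Each query from a lower address restarts the interval, so the router stays a nonquerier for as long as the elected querier keeps transmitting.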
A querier for a link periodically sends (with timer [Query Interval]) a general
query on that link to solicit reports of all multicast addresses of interest on that link. On
startup, a router should send as many general queries as specified by [Startup Query
Count] spaced closely together (based on the [Startup Query Interval] timer) on all
attached links to quickly and reliably discover the presence of multicast listeners on
those links.
General queries are sent to the link-scope all-nodes multicast address (FF02::1) with
a Multicast Address field of 0 and a maximum response delay defined by the timer
[Query Response Interval].
PROTOCOL DESCRIPTION 219
When a node receives a general query, it sets a delay timer for each multicast address
to which it is listening on the interface from which it received the query, excluding the
link-scope all-nodes address and any multicast addresses of scope 0 (reserved) or 1
(node-local). Each timer is set to a different random value, using the highest
clock granularity available on the node, selected from the range [0,Maximum Response
Delay] with the maximum response delay as specified in the query packet. If a
timer for any address is already running, it is reset to the new random value only if
the requested maximum response delay is less than the remaining value of the running
timer. If the query packet specifies a maximum response delay of zero, each timer is
effectively set to zero, and the action specified below for timer expiration is performed
immediately.
When a node receives a multicast-address-specific query, if it is listening to the
queried multicast address on the interface from which the query was received, it sets a
delay timer for that address to a random value selected from the range [0,Maximum
Response Delay], as above. If a timer for the address is already running, it is reset to the
new random value only if the requested maximum response delay is less than the
remaining value of the running timer. If the query packet specifies a maximum response
delay of zero, the timer is effectively set to zero, and the action specified below for timer
expiration is performed immediately.
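The timer rules in the two preceding paragraphs reduce to one small decision, sketched below in Python. The function name and the convention of passing `None` when no timer is running are assumptions of this example.

```python
import random


def next_report_delay(max_response_delay, remaining=None, rng=random):
    """Return the report delay to use after a query arrives.

    max_response_delay: value carried in the query.
    remaining: time left on an already running timer, or None if none runs.
    """
    if max_response_delay == 0:
        return 0.0  # act as if the timer expired immediately
    if remaining is not None and max_response_delay >= remaining:
        return remaining  # keep the running timer untouched
    # No timer running, or the query asks for a faster answer: pick anew.
    return rng.uniform(0, max_response_delay)
```

The same function serves both general and multicast-address-specific queries; a general query simply applies it once per address the node is listening to.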
If a node's timer for a particular multicast address on a particular interface expires,
the node transmits a report to that address via that interface; the address being reported is
carried in both the IPv6 Destination Address field and the MLD Multicast Address field
of the report packet. The IPv6 hop limit of 1 (as well as the presence of a link-local
IPv6 source address) prevents the packet from traveling beyond the link to which the
reporting interface is attached.
If a node receives another node's report from an interface for a multicast address
while it has a timer running for that same address on that interface, it stops its timer
and does not send a report for that address, thus suppressing duplicate reports on the
link.
When a router receives a report from a link, if the reported address is not already
present in the router's list of multicast addresses having listeners on that link, the reported
address is added to the list, its timer is set to [Multicast Listener Interval], and its
appearance is made known to the router's multicast routing component. If a report is
received for a multicast address that is already present in the router's list, the timer for
that address is reset to [Multicast Listener Interval]. If an address's timer expires, it is
assumed that there are no longer any listeners for that address present on the link; so it is
deleted from the list and its disappearance is made known to the multicast routing
component.
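The router-side bookkeeping described above can be sketched as a per-link table of expiry times. The class name, the `notify` callback standing in for the multicast routing component, and the explicit `now` arguments are hypothetical choices for this illustration.

```python
class ListenerTable:
    """Per-link list of multicast addresses that have listeners."""

    def __init__(self, multicast_listener_interval: float, notify):
        self.interval = multicast_listener_interval
        self.notify = notify   # callback(group, present) to the routing component
        self.expiry = {}       # group -> absolute time at which its timer expires

    def on_report(self, group: str, now: float) -> None:
        if group not in self.expiry:
            self.notify(group, True)          # new address appears
        self.expiry[group] = now + self.interval  # set or reset the timer

    def expire(self, now: float) -> None:
        for group in [g for g, t in self.expiry.items() if t <= now]:
            del self.expiry[group]
            self.notify(group, False)         # no listeners remain
```

Note that the table records only that a listener exists, never which node reported or how many listeners there are, matching the requirement stated earlier.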
When a node starts listening to a multicast address on an interface, it should
immediately transmit an unsolicited report for that address on that interface in case it is
the first listener on the link. To cover the possibility of the initial report being lost or
damaged, it is recommended that it be repeated once or twice after short delays
[Unsolicited Report Interval] (a simple way to accomplish this is to send the initial
report and then act as if a multicast-address-specific query was received for that address
and set a timer appropriately).
220 MULTICAST LISTENER DISCOVERY
Node behavior is more formally specified by the state transition diagram below. A node
may be in one of the three possible states with respect to any single IPv6 multicast address
on any single interface:
. “Nonlistener” state, when the node is not listening to the address on the interface
(i.e., no upper-layer protocol or application has requested reception of packets to
that multicast address). This is the initial state for all multicast addresses on all
interfaces; it requires no storage in the node.
. “Delaying listener” state, when the node is listening to the address on the interface
and has a report delay timer running for that address.
. “Idle listener” state, when the node is listening to the address on the interface and
does not have a report delay timer running for that address.
There are five significant events that can cause MLD state transitions:
. “Start listening” occurs when the node starts listening to the address on the
interface. It may occur only in the nonlistener state.
. “Stop listening” occurs when the node stops listening to the address on the
interface. It may occur only in the delaying listener and idle listener states.
. “Query received” occurs when the node receives either a valid General Query
message or a valid Multicast-Address-Specific Query message. To be valid, the
Query message must come from a link-local IPv6 source address, be at least 24
octets long, and have a correct MLD checksum. The Multicast Address field in the
MLD message must contain either zero (a general query) or a valid multicast
address (a multicast-address-specific query). A general query applies to all
multicast addresses on the interface from which the query is received. A multicast-
address-specific query applies to a single multicast address on the interface
from which the query is received. Queries are ignored for addresses in the nonlistener
state.
. “Report received” occurs when the node receives a valid MLD Report message. To
be valid, the Report message must come from a link-local IPv6 source address, be
at least 24 octets long, and have a correct MLD checksum. A report applies only to
the address identified in the Multicast Address field of the Report, on the interface
from which the report is received. It is ignored in the nonlistener or idle listener
state.
. “Timer expired” occurs when the report delay timer for the address on the
interface expires. It may occur only in the delaying listener state.
All other events, such as receiving invalid MLD messages or MLD message types other
than Query or Report, are ignored in all states.
There are seven possible actions that may be taken in response to the above events:
. “Send report” for the address on the interface. The Report message is sent to the
address being reported.
. “Send done” for the address on the interface. If the flag saying we were the last
node to report is cleared, this action may be skipped. The Done message is sent to
the link-scope all-routers address (FF02::2).
. “Set flag” that we were the last node to send a report for this address.
. “Clear flag” since we were not the last node to send a report for this address.
. “Start timer” for the address on the interface using a delay value chosen uniformly
from the interval [0,Maximum Response Delay], where the maximum response
delay is specified in the query. If this is an unsolicited report, the timer is set to a
delay value chosen uniformly from the interval [0,Unsolicited Report Interval].
. “Reset timer” for the address on the interface to a new value using a delay value
chosen uniformly from the interval [0, Maximum Response Delay], as described
in “start timer.”
. “Stop timer” for the address on the interface.
See Figure 10.2 for the protocol state machine. In all of the following state transition
diagrams, each state transition arc is labeled with the event that causes the transition
and, in parentheses, any actions taken during the transition. Note that the transition is
always triggered by the event; even if the action is conditional, the transition still
occurs.
The link-scope all-nodes address (FF02::1) is handled as a special case. The node
starts in the idle listener state for that address on every interface, never transitions to
another state, and never sends a Report or Done message for that address.
MLD messages are never sent for multicast addresses whose scope is 0 (reserved) or
1 (node-local).
MLD messages are sent for multicast addresses whose scope is 2 (link-local),
including solicited-node multicast addresses, except for the link-scope all-nodes address
(FF02::1).
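The node states, events, and actions listed above can be combined into a small state machine. The Python sketch below follows the transitions of the RFC 2710 node state diagram as the editor understands them; the class name, the recorded `sent` list in place of real packet transmission, and the default Unsolicited Report Interval of 10 seconds are assumptions. Elapsed time is not modeled: `timer` holds the delay value that was chosen.

```python
import random


class ListenerState:
    """State of one multicast address on one interface."""

    def __init__(self, unsolicited_report_interval=10.0, rng=random):
        self.state = "nonlistener"
        self.flag = False    # True if this node sent the last report
        self.timer = None    # chosen delay, or None when no timer runs
        self.uri = unsolicited_report_interval
        self.rng = rng
        self.sent = []       # messages this node would transmit

    def start_listening(self):
        if self.state == "nonlistener":
            self.sent.append("report")
            self.flag = True
            self.timer = self.rng.uniform(0, self.uri)
            self.state = "delaying"

    def stop_listening(self):
        if self.state in ("delaying", "idle"):
            self.timer = None
            if self.flag:  # Done may be skipped when the flag is clear
                self.sent.append("done")
            self.state = "nonlistener"

    def query_received(self, max_response_delay):
        if self.state == "idle":
            self.timer = self.rng.uniform(0, max_response_delay)
            self.state = "delaying"
        elif self.state == "delaying" and max_response_delay < self.timer:
            self.timer = self.rng.uniform(0, max_response_delay)

    def report_received(self):
        if self.state == "delaying":  # ignored in the other two states
            self.timer = None
            self.flag = False
            self.state = "idle"

    def timer_expired(self):
        if self.state == "delaying":
            self.sent.append("report")
            self.flag = True
            self.timer = None
            self.state = "idle"
```

The link-scope all-nodes address would simply never be given such an object, since it stays in the idle listener state permanently.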
ROUTER STATE TRANSITION DIAGRAM 223
With respect to any single attached link, a router may be in one of two possible states:
. “Querier,” when this router is designated to transmit MLD queries on this link.
. “Nonquerier,” when there is another router designated to transmit MLD queries on
this link.
The following three events can cause the router to change states:
. “Query timer expired” occurs when the timer set for query transmission expires.
This event is significant only when in the querier state.
. “Query received from a router with a lower IP address” occurs when a valid
MLD query is received from a router on the same link with a lower IPv6
source address. To be valid, the Query message must come from a link-local
IPv6 source address, be at least 24 octets long, and have a correct MLD
checksum.
. “Other querier present timer expired” occurs when the timer set to note the
presence of another querier with a lower IP address on the link expires. This event
is significant only when in the nonquerier state.
There are three actions that may be taken in response to the above events:
. “Start general query timer” for the attached link to [Query Interval].
. “Start other querier present timer” for the attached link to [Other Querier Present
Interval].
. “Send general query” on the attached link. The general query is sent to the
link-scope all-nodes address (FF02::1) and has a maximum response delay of
[Query Response Interval].
A router starts in the Initial state on all attached links and immediately transitions to
the querier state. In addition, to keep track of which multicast addresses have listeners, a
router may be in one of the three possible states with respect to any single IPv6 multicast
address on any single attached link:
. “No listeners present” state, when there are no nodes on the link that have sent a
report for this multicast address. This is the initial state for all multicast addresses
on the router; it requires no storage in the router.
. “Listeners present” state, when there is a node on the link that has sent a report for
this multicast address.
. “Checking listeners” state, when the router has received a Done message but has
not yet heard a report for the identified address.
There are five significant events that can cause router state transitions:
. “Report received” occurs when the router receives a report for the address from the
link. To be valid, the Report message must come from a link-local IPv6 source
address, be at least 24 octets long, and have a correct MLD checksum.
. “Done received” occurs when the router receives a Done message for the address
from the link. To be valid, the Done message must come from a link-local IPv6
source address, be at least 24 octets long, and have a correct MLD checksum. This
event is significant only in the listeners present state and when the router is a
querier.
There are seven possible actions that may be taken in response to the above
events:
. “Start timer” for the address on the link — also resets the timer to its initial value
[Multicast Listener Interval] if the timer is currently running.
. “Start timer*” for the address on the link — this alternate action sets the timer to
the minimum of its current value and either [Last Listener Query Interval] ·
[Last Listener Query Count] if this router is a querier or the maximum response
delay in the query message · [Last Listener Query Count] if this router is a
nonquerier.
. “Start retransmit timer” for the address on the link [Last Listener Query Interval].
. “Clear retransmit timer” for the address on the link.
. “Send multicast-address-specific query” for the address on the link. The
multicast-address-specific query is sent to the address being queried and has a
maximum response delay of [Last Listener Query Interval].
. “Notify routing +”: internally notify the multicast routing protocol that there are
listeners to this address on this link.
. “Notify routing −”: internally notify the multicast routing protocol that there are
no longer any listeners to this address on this link.
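Of the actions above, “Start timer*” involves the only nontrivial computation. A minimal Python sketch, with a hypothetical function name and keyword arguments, is:

```python
def start_timer_star(current, last_listener_query_count, is_querier,
                     last_listener_query_interval=None,
                     query_max_response_delay=None):
    """Return the new timer value for the "Start timer*" action.

    A querier scales its own [Last Listener Query Interval]; a nonquerier
    scales the maximum response delay carried in the received query.
    """
    base = last_listener_query_interval if is_querier else query_max_response_delay
    if base is None:
        raise ValueError("missing interval for this router role")
    # The timer is only ever lowered, never raised, by this action.
    return min(current, base * last_listener_query_count)
```

With illustrative values, a querier whose timer stands at 260 seconds, with an interval of 1 second and a count of 2, lowers the timer to 2 seconds; a timer already at 1.5 seconds is left alone.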
The state diagrams that follow apply per group per link (one for routers in the querier
state and one for routers in the nonquerier state). The transition between querier and
nonquerier states on a link is handled specially. All groups on that link in the no listeners
present or listeners present state switch state transition diagrams when the querier/
nonquerier state transition occurs. However, any groups in checking listeners state
continue with the same state transition diagram until the checking listeners state is exited.
For example, a router that starts as a querier, receives a Done message for a group, and
then receives a query from a router with a lower address (causing a transition to the
nonquerier state) continues to send multicast-address-specific queries for the group in
question until it either receives a report or its timer expires, at which time it starts
performing the actions of a nonquerier for this group.
The state transition diagram for a router in the querier state is shown in Figure 10.4.
The state transition diagram for a router in the nonquerier state is similar, but
nonqueriers do not send any messages and are only driven by message reception. See
Figure 10.5.
The MLDv2 protocol, when compared to MLDv1, adds support for “source filtering,”
that is, the ability for a node to report interest in listening to packets only from specific
source addresses, as required to support source-specific multicast defined in RFC 3569,
or from all but specific source addresses sent to a particular multicast address. MLDv2 is
designed to be interoperable with MLDv1. RFC 3810 (June 2004) updates RFC 2710.
Below is a summary of MLDv2 based directly on the RFC. Developers and interested
parties should consult the RFC itself.
OVERVIEW OF MLDv2 227
Figure 10.5. The State Transition Diagram for a Router in Nonquerier State
router to be in the querier state. This router is called the querier. All multicast routers on
the subnet listen to the messages sent by multicast address listeners and maintain the
same multicast listening information state so that they can take over the querier role
should the present querier fail. Nevertheless, only the querier sends periodic or triggered
query messages on the subnet.
A multicast address listener performs the listener part of the MLDv2 protocol on all
interfaces on which multicast reception is supported, even if more than one of those
interfaces are connected to the same link.
message. The state change report contains either filter mode change records, source list
change records, or records of both types.
Both router and listener state changes are mainly triggered by the expiration of a
specific timer or the reception of an MLD message (listener state change can also be
triggered by the invocation of a service interface call). Therefore, to enhance protocol
robustness, in spite of the possible unreliability of message exchanges, messages are
retransmitted several times. Furthermore, timers are set so as to take into account the
possible message losses and to wait for retransmissions.
Periodic general queries and current state reports do not apply this rule, in order not
to overload the link; it is assumed that, in general, these messages do not generate state
changes, their main purpose being to refresh the existing state. Thus, even if one such
message is lost, the corresponding state will be refreshed during the next reporting
period.
As opposed to current state reports, state change reports are retransmitted several
times in order to avoid them being missed by one or more multicast routers. The number
of retransmissions depends on the so-called robustness variable. This variable allows
tuning the protocol according to the expected packet loss on a link. If a link is expected to
be lossy (e.g., a wireless connection), the value of the robustness variable may be
increased. MLD is robust to [Robustness Variable] − 1 packet losses. The RFC
recommends a default value of 2 for the robustness variable.
If more changes to the same per-interface state entry occur before all the
retransmissions of the state change report for the first change have been completed, each additional
change triggers the immediate transmission of a new state change report.
Retransmissions of the new state change report will be scheduled as well, in order to ensure that each
instance of state change is transmitted at least [Robustness Variable] times.
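The effect of the robustness variable can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that each copy of a state change report is lost independently with the same probability; real links (wireless ones in particular) often lose packets in bursts, so the result is only a rough guide. The function names are hypothetical.

```python
def tolerated_losses(robustness_variable: int = 2) -> int:
    """Packet losses the protocol tolerates: [Robustness Variable] - 1."""
    return robustness_variable - 1


def all_copies_lost_probability(loss_rate: float,
                                robustness_variable: int = 2) -> float:
    """Chance that every one of the transmitted copies is lost,
    under the independence assumption stated above."""
    return loss_rate ** robustness_variable
```

Under this assumption, a 10% loss rate with the default value of 2 leaves roughly a 1% chance that a state change goes unnoticed; raising the variable to 3 reduces it to about 0.1%.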
If a node on a link expresses, through a state change report, its desire to no longer
listen to a particular multicast address (or source), the querier must query for other
listeners of the multicast address (or source) before deleting the multicast address (or
source) from its multicast address listener state and stopping the corresponding
traffic. Thus, the querier sends a multicast address-specific query to verify whether
there are nodes still listening to a specified multicast address or not. Similarly, the
querier sends a multicast address- and source-specific query to verify whether, for a
specified multicast address, there are nodes still listening to a specific set of sources
or not.
Both multicast address-specific queries and multicast address- and source-
specific queries are only sent in response to state change reports and never in response
to current state reports. This distinction between the two types of reports is needed to
avoid the router treating all multicast listener reports as potential changes in state. By
doing so, the fast leave mechanism of MLDv2 might not be effective if a state change
report is lost, and only the following current state report is received by the router.
Nevertheless, it avoids an increased processing at the router, and it reduces the MLD
traffic on the link.
Nodes respond to the above queries through current state reports, which contain their
per-interface multicast address listening state, only for the multicast addresses (or
sources) being queried.
As stated earlier, in order to ensure protocol robustness, all the queries, except the
periodic general queries, are retransmitted several times within a given time interval. The
number of retransmissions depends on the robustness variable. If, while scheduling
new queries, there are pending queries to be retransmitted for the same multicast address,
the new queries and the pending queries have to be merged. In addition, host reports
received for a multicast address with pending queries may affect the contents of those
queries.
Protocol robustness is also enhanced through the use of the S flag (suppress
router-side processing). As described above, when a multicast address-specific or a
multicast address- and source-specific query is sent by the querier, a number of
retransmissions of the query are scheduled. In the original (first) query the S flag is
clear. When the querier sends this query, it lowers the timers for the concerned multicast
address (or source) to a given value; similarly, any nonquerier multicast router that
receives the query lowers its timers in the same way. Nevertheless, while waiting for the
next scheduled queries to be sent, the querier may receive a report that updates the timers.
The scheduled queries still have to be sent, in order to ensure that a nonquerier router
keeps its state synchronized with the current querier (the nonquerier router might have
missed the first query). Nevertheless, the timers should not be lowered again, as a valid
answer was already received. Therefore, in subsequent queries the querier sets the S flag.
its desire to stop listening to a specific source, all the multicast routers on the link lower
their timers for that source to a given value. The querier then sends a multicast address-
and source-specific query to verify whether there are other listeners for that source on the
link or not. If a report that includes this source is received before the timer expiration, all
the multicast routers on the link update the source timer. If not, the source is deleted from
the include list.
A router is in the EXCLUDE mode for a specific multicast address on a given
interface if there is at least one listener in the EXCLUDE mode for that address on the
link. When the first report is received from such a listener, the router sets the filter timer
that corresponds to that address. This timer is reset each time an EXCLUDE mode
listener confirms its listening state through a current state report. The timer is also
updated when a listener, formerly in the INCLUDE mode, announces its filter mode
change through a State Change Report message. If the filter timer expires, it means that
there are no more listeners in the EXCLUDE mode on the link. In this case, the router
switches back to the INCLUDE mode for that multicast address.
When the router is in the EXCLUDE mode, the router state is represented by the
notation EXCLUDE (X,Y), where X is called the “requested list” and Y is called
the “exclude list.” All sources, except those from the exclude list, will be forwarded
by the router. The requested list has no effect on forwarding. Nevertheless, the router
has to maintain the requested list for the following two reasons:
1. To keep track of sources that listeners in the INCLUDE mode listen to. This is
necessary to assure a seamless transition of the router to the INCLUDE mode
when there is no listener in the EXCLUDE mode left. This transition should not
interrupt the flow of traffic to listeners in the INCLUDE mode for that multicast
address. Therefore, at the time of the transition, the requested list should
contain the set of sources that nodes in the INCLUDE mode have explicitly
requested.
When the router switches to the INCLUDE mode, the sources in the
requested list are moved to the include list, and the exclude list is deleted.
Before switching, the requested list can contain an inexact guess of the sources
that listeners in the INCLUDE mode listen to; it might be too large or too small.
These inexactitudes are due to the fact that the requested list is also used for fast
blocking purposes, as described below. If such a fast blocking is required, some
sources may be deleted from the requested list to reduce the router state.
Nevertheless, in each such case the filter timer is updated as well. Therefore,
listeners in the INCLUDE mode will have enough time, before an eventual
switching, to reconfirm their interest in the eliminated source(s) and rebuild the
requested list accordingly. The protocol ensures that when a switch to the
INCLUDE mode occurs, the requested list is accurate.
2. To allow the fast blocking of previously unblocked sources. If the router receives
a report that contains such a request, the concerned sources are added to the
requested list. Their timers are set to a given small value, and a multicast address-
and source-specific query is sent by the querier to check whether there are nodes
on the link still interested in those sources or not. If no node announces its interest
in receiving those specific sources, the timers of those sources expire. Then, the
sources are moved from the requested list to the exclude list. From then onwards,
the sources will be blocked by the router.
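The role of the two lists in the EXCLUDE (X,Y) notation can be sketched in Python as follows. This is a simplified model with hypothetical names; per-source timers and the filter timer are represented only by the methods called when they expire.

```python
class ExcludeModeState:
    """Router state EXCLUDE (X, Y) for one multicast address on one link."""

    def __init__(self, requested=(), excluded=()):
        self.requested = set(requested)  # X: the requested list
        self.excluded = set(excluded)    # Y: the exclude list

    def forwards(self, source) -> bool:
        # Only the exclude list affects forwarding.
        return source not in self.excluded

    def source_timer_expired(self, source) -> None:
        # No listener confirmed interest: block the source from now on.
        if source in self.requested:
            self.requested.discard(source)
            self.excluded.add(source)

    def filter_timer_expired(self) -> set:
        # No EXCLUDE-mode listener remains: the requested list becomes
        # the include list and the exclude list is deleted.
        include = set(self.requested)
        self.requested.clear()
        self.excluded.clear()
        return include
```

The set returned by `filter_timer_expired` is what the router would use as its include list after switching to the INCLUDE mode.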
What follows is a brief discussion of source filtering based on RFC 4604. The term
“Source Filtering GMP (SFGMP)” is used to refer jointly to the IGMPv3 and MLDv2
group management protocols. The use of source-specific multicast is facilitated by small
changes to the SFGMP protocols on both hosts and routers. SSM defines general
requirements that must be followed by systems that implement the SSM service model;
RFC 4604 defines the concrete application of those requirements to systems that
implement IGMPv3 and MLDv2. In doing so, it defines modifications to the
host and router portions of IGMPv3 and MLDv2 for use with SSM and presents a number
of clarifications to their behavior when used with SSM addresses. RFC 4604 updates the
IGMPv3 and MLDv2 specifications.
One should note that SSM can be used by any host that supports source filtering APIs
and whose operating system supports the appropriate SFGMP. The SFGMP
modifications, as described in RFC 4604, make SSM work better on an SSM-aware host (but they
are not strict prerequisites for the use of SSM).
The 232/8 IPv4 address range is currently allocated for SSM by the IANA. In IPv6,
the FF3x::/32 range (where x is a valid IPv6 multicast scope value) is reserved for SSM
semantics, although today SSM allocations are restricted to FF3x::/96.
A host that knows the SSM address range and is capable of applying SSM semantics
to it is described as an “SSM-aware” host. A host or router may be configured to apply
SSM semantics to addresses other than those in the IANA-allocated range. The GMP
module on a host or router should have a configuration option to set the SSM address
range(s). If this configuration option exists, it must default to the IANA-allocated SSM
range. The mechanism for setting this configuration option must at least allow for manual
configuration. Protocol mechanisms to set this option may be defined in the future.
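A host or router that applies SSM semantics needs a test for whether an address falls in the SSM range. The Python sketch below checks the default IANA-allocated ranges named above (232/8 and FF3x::/32); the function name is hypothetical, and for simplicity it accepts any value of the scope nibble x rather than validating it. A configurable implementation would take the range list as a parameter.

```python
import ipaddress

IPV4_SSM = ipaddress.ip_network("232.0.0.0/8")


def is_ssm_address(address: str) -> bool:
    """True if the address lies in the default SSM range."""
    addr = ipaddress.ip_address(address)
    if addr.version == 4:
        return addr in IPV4_SSM
    p = addr.packed
    # FF3x::/32: first octet FF, flags nibble 3, any scope nibble x,
    # and the following 16 bits all zero.
    return p[0] == 0xFF and (p[1] >> 4) == 0x3 and p[2] == 0 and p[3] == 0
```

An SSM-aware host could call such a check before accepting a non-source-specific join, returning an error to the application when the destination is an SSM address.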
If the host IP module of an SSM-aware host receives a non-source-specific request
to receive multicast traffic sent to an SSM destination address, it should return an
error to the application. On a non-SSM-aware host, an application that uses the
wrong API [e.g., “join(G),” “IPMulticastListen(G,EXCLUDE(S1))” for IGMPv3 or
“IPv6MulticastListen(G,EXCLUDE(S2))” for MLDv2] to request delivery of packets
sent to an SSM address will not receive the requested service because an SSM-aware
router (following the rules of RFC 4604) will refuse to process the request, and
the application will receive no indication other than a failure to receive the requested
traffic.
RFC 4604 documents the behavior of an SSM-aware host with respect to sending
and receiving the following GMP message types:
REFERENCES
[RFC2710] RFC 2710, Multicast Listener Discovery (MLD) for IPv6, S. Deering, W. Fenner, B.
Haberman, October 1999, updated by RFC 3590, RFC 3810.
[RFC3810] RFC 3810, Multicast Listener Discovery Version 2 (MLDv2) for IPv6, R. Vida, Ed.,
L. Costa, Ed., June 2004.
[RFC4604] RFC 4604, Using Internet Group Management Protocol Version 3 (IGMPv3) and
Multicast Listener Discovery Protocol Version 2 (MLDv2) for Source-Specific Multicast, H.
Holbrook, B. Cain, B. Haberman, August 2006.
11
IPTV APPLICATIONS
Entertainment is “big business” all over the world. Major markets include North
America, Europe, Asia, and South America. The annual residential cable TV revenue
was estimated to be $75 billion in 2007 in the United States alone, providing services to
about 66 million subscribers. IPTV services are being pursued initially by traditional telephone
companies as a way to enter this market; eventually cable TV companies may
adopt the same technology. Traditional telephone companies (“telcos”) have seen
erosions in their revenues and they seek ways to increase their ARPU (Average Revenue
Per User) using video services and “triple/quadruple play.”1 This chapter provides
a terse overview of the topic; the discussion is not intended to be comprehensive or
completely systematic: doing justice to the topic would require an entire lengthy
textbook.
OVERVIEW AND MOTIVATION 235
on-demand video content over IP-based networks, while meeting all prerequisite quality
of service, quality of experience, conditional access (security), blackout management
(for sporting events), emergency alert system, closed captions, parental controls,
Nielsen rating collection, secondary audio channel, picture-in-picture, and guide
data requirements of the content providers and/or regulatory entities. Typically,
IPTV makes use of MPEG-4 encoding to deliver 200–300 SD channels and
20–40 HD channels; viewers need to be able to switch channels within 2 s or less; also,
the need exists to support multi-STB/multiprogramming (say two to four) within a single
domicile.
IPTV (also known in some quarters as telco TV) is not to be confused with simple
delivery of video over an IP network, including video streaming, which has been possible
for over two decades; IPTV supports all business, billing, provisioning, and content
protection requirements that are associated with commercial video distribution. IP-based
service needs to be comparable to that received over cable TV or DBS. In addition to TV
sets, the content may also be delivered to a personal computer. MPEG-4, which operates
at 2.5 Mbps for SD video and 8–11 Mbps for HD video, is critical to telco-based video
delivery over a copper-based plant because of the bandwidth limitations of that plant,
particularly when multiple simultaneous streams need to be delivered to a domicile;
MPEG-2 would typically require a higher bit rate for the same perceived video quality
[MIN199501, MIN200301].
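The per-stream rates quoted above can be turned into a rough bandwidth budget for one domicile. The following is a minimal sketch, not a sizing tool: the function names, the 25 Mbps access-line rate, and the 5 Mbps reserve for data and voice are hypothetical values chosen only for illustration.

```python
# Per-stream MPEG-4 rates quoted in the text (Mbps).
SD_MBPS = 2.5
HD_MBPS_RANGE = (8.0, 11.0)


def household_video_load(sd_streams: int, hd_streams: int) -> tuple[float, float]:
    """Return the (low, high) aggregate video bit rate in Mbps for one domicile."""
    low = sd_streams * SD_MBPS + hd_streams * HD_MBPS_RANGE[0]
    high = sd_streams * SD_MBPS + hd_streams * HD_MBPS_RANGE[1]
    return low, high


def fits(sd_streams: int, hd_streams: int, line_rate_mbps: float,
         reserve_mbps: float = 0.0) -> bool:
    """Check the worst-case video load plus a reserve against the line rate."""
    _, high = household_video_load(sd_streams, hd_streams)
    return high + reserve_mbps <= line_rate_mbps


if __name__ == "__main__":
    low, high = household_video_load(sd_streams=2, hd_streams=1)
    print(f"2 SD + 1 HD streams: {low:.1f} to {high:.1f} Mbps")
    # 25 Mbps and the 5 Mbps reserve are assumptions, not figures from the text.
    print("fits a 25 Mbps line:", fits(2, 1, line_rate_mbps=25.0, reserve_mbps=5.0))
```

Budgeting against the high end of the HD range is a conservative choice; an operator that caps the encoder bit rate could budget against the cap instead.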
Hence, in summary, an IPTV system must provide, as a minimum, the same service
and content as extant cable networks, including broadcast television, local channels,
premium channels, Pay Per View (PPV), music, and Personal Digital Recorder (PDR2)
services. SD and HD channels must be supported. Very high availability (no less than
99.999%) must also be supported.
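To give a feel for what an availability target of 99.999% implies, the permitted downtime per year can be computed directly. This is a simple worked calculation; the function name is illustrative, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def max_downtime_minutes(availability_percent: float) -> float:
    """Yearly downtime (in minutes) permitted by a given availability figure."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        print(f"{target}% availability allows about "
              f"{max_downtime_minutes(target):.2f} minutes of downtime per year")
```

At 99.999% the allowance is a little over five minutes per year, which is why this level is commonly referred to as "five nines."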
With the significant erosion in revenues from traditional voice services on
wireline-originated calls (both in terms of depressed pricing and a shift to VoIP
over broadband Internet services delivered over cable TV infrastructure), and with the
transition of many customers from wireline to wireless services, the traditional
telephone carriers find themselves in need of generating new revenues by seeking
to deliver video services to their customers. Traditional phone carriers
find themselves challenged in the voice arena (by VoIP and other providers);
their Internet services are also challenged in the broadband Internet access arena
(by cable TV companies); and, their video services are nascent and challenged by
a lack of deployed technology. Multimedia (and new media) services are a way to
improve telco revenues.
Table 11.1 depicts the gains in ARPU that can be achieved by adding linear
programming and nonlinear programming (e.g., VoD and PDR services). Line rentals
and voice calls contribute an ARPU of $38; adding broadband brings
the figure to $68. Adding video services increases this figure by $52, to a total
of $120.
IPTV is in the early stage of technical and market development worldwide. The
U.S. telco IPTV market opportunity is projected by market research firms to reach
10 million subscribers by 2012 and about 65 million subscribers worldwide. That could
represent an IPTV services market of $6–$8 billion in the United States and $39–$52
billion worldwide by 2012; the Compound Annual Growth Rate (CAGR) is 35–40% for
the next 5 years according to some. Table 11.2 depicts some press time stats and
Table 11.3 depicts a forecast for the next few years.
2
These are also called Digital Video Recorders (DVRs) or Personal Video Recorders (PVRs).
According to press time data from Infonetics Research, worldwide IPTV equipment
revenue reached about $425 million in the first quarter of 2007. According to the same
market research, worldwide IPTV equipment manufacturer revenue increased over 150%
in 2006, passing the $1 billion mark. While the IPTV equipment market is expected to
grow at a more moderate pace in coming years, most equipment categories are forecast to
at least double or triple between 2006 and 2010 [INF200701]. It has been estimated that
to deploy the systems needed to deliver IPTV, carriers will spend in the range of
$20 billion between 2007 and 2012 in the United States alone. There appears to be
favorable regulatory support in the United States for the entrance of the telcos into the
video field. However, telcos may face franchising rights challenges. It is worth noting
that historically the cable TV/pay TV market has been insensitive to economic cycles.
Table 11.4 provides a basic glossary of IPTV service concepts from Nortel
[NOR200601]; refer to the glossary at the back of the book for a more exhaustive listing.
[Figure: U.S. TV subscriber households (in millions, scale 0 to 100) by delivery type: telco TV, analog cable, digital cable, and satellite.]
Note: This figure shows DSL delivery, likely ADSL2+/VDSL, but FTTH can also
be used.
Note: This figure does not show the middleware server either distributed at the telco
headend or centralized at the content aggregator.
Note: This figure does not show the content acquisition; the uniform transcoding
(e.g., using MPEG-4) is only hinted by the device at the far left.
Note: This figure does not show the specifics of how the Entitlement Control Message
(ECM) and Entitlement Management Message (EMM) to support the conditional access
function are distributed resiliently. This is typically done in-band for the ECMs and
out-of-band [e.g., using a Virtual Private Network (VPN) over the Internet] for the EMMs.
Note: This figure does not show the video-on-demand overlay that is deployed over the
same infrastructure to deliver this and other advanced services.
Note: This figure does not show a blackout management system, which is needed to
support substitution of programming for local sport events.
Note: This figure does not show how the tribune programming data is injected
into the IPTV system, which is needed for scheduling/programming support.
Figure 11.2 is an architecture that is basically similar to that of Figure 11.1, but
the distribution to the remote telcos is done via a satellite broadcast technology.
Satellite delivery is typical of how cable TV operators today receive their signals
BASIC ARCHITECTURE 239
[Figure: content sources (Content 1 through Content n) feed encryptors under a conditional-access system with a control word generator; firewalls; dense or sparse-dense multicast distribution to DSLAMs.]
from various media content producers (e.g., ABC/Disney, CNN, UPN, Discovery,
A&E). In the case of the cable TV/Multiple Systems Operators (MSOs), the operator
would typically have (multiple) satellite antenna(s) accessing multiple transponders
on a satellite or on multiple satellites and then combine these signals for distribution.
See Figure 11.3 for a pictorial example. In contrast, in the architecture of Figure 11.4,
the operator will need only one receive antenna because the signal aggregation
(conditional access, middleware administration, etc.) is done at the central point of
content aggregation.
Zooming in a bit, the technology elements (subsystems) involved in linear IPTV
include the following:
. Content aggregation
. Uniform transcoding
. Conditional-access management
. Encapsulation
. Long-haul distribution
. Local distribution
. Middleware
. STBs
. Catcher (for VoD services)
Figure 11.3. Disadvantages of Distributed Source IPTV: Requires Need for Dish Farms at Each
Telco and for All Ancillary Subsystems
[Figure 11.3 legend: E = Encapsulator; Mo = Modulator; U = Up converter; HPA = High-Power Amplifier; MC = Mixer/combiner; rec = Receiver.]
no one vendor has a true end-to-end solution. Hence, each of the following can be seen as
a subsystem platform in its own right:
. Content aggregation
. Uniform transcoding
. Conditional-access management
. Encapsulation
. Long-haul distribution
. Local distribution
. Middleware
. STBs
. Catcher (for VoD services)
[Figure: content providers feed source receivers, an IP switch, and IP encapsulation at a single source; each telco (Telco 1, Telco 2) uses an IP receiver, IP router, DSLAM, and STB.]
Figure 11.4. Advantages of Single Source IPTV: Obviates Need for Dish Farms at Each Telco
Applications such as video are very sensitive to end-to-end delay, jitter, and
(uncorrectable) packet loss; QoS considerations are critical. These networks tend to
have fewer hops, and pruning may be somewhat trivially implemented by making use of a
simplified network topology.
[Figure: output of Encoders 1 through m as MPEG-2 TS packets (4-byte header plus 184 bytes of video, audio, or data); seven TS packets with the same PID are carried in one IP multicast UDP datagram (UDP 8 bytes, IP 20 bytes) over MPE (17 bytes) or Ethernet (14 bytes).]
depicts the basic MPEG-2/4 framework. Each Elementary Stream (ES) output by an
MPEG audio, video, or (some) data encoder contains a single type of (usually
compressed) signal. There are a number of types of ES, including:
[Figure: at the content aggregation site, an IP encapsulator (IPE) performs the SAR function, mapping IP multicast packets (IP 20 bytes, UDP 8 bytes, seven TS packets with the same PID) into MPEG-2 TS datagrams for the MPEG-2 transport infrastructure (backbone network); at the telco headend site, a receiver tunes to the appropriate frequency band and reassembles the SARed TS datagrams into IP multicast packets.]
[Figure: an uncompressed video stream enters an MPEG encoder, and the encoded elementary stream passes through a packetizer; a data source has its own packetizer.]
For video and audio, the information is organized into access units, each
representing a fundamental unit of encoding. For example, in video, an access unit
will usually be a complete encoded video frame. Each ES is an input to an MPEG-2
processor (e.g., a video compressor) which accumulates the data into a stream of
Packetized Elementary Stream (PES) packets. The compression is achieved using the
Discrete Cosine Transform (DCT). A PES packet may be a fixed- or variable-sized
block, with up to 65,536 octets per block, and includes a 6-byte protocol header. A PES
is usually organized to contain an integral number of ES access units. In MPEG-2
networks, an IP address must be associated with a PID3 and a specific transmission
multiplex.
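The 6-byte PES header mentioned above can be read with a few lines of code. The sketch below assumes the field layout defined in the MPEG-2 systems standard (a 3-byte start code prefix 0x000001, a 1-byte stream ID, and a 2-byte packet length); that layout is not spelled out in this chapter, and the function name is illustrative.

```python
import struct

PES_START_CODE_PREFIX = b"\x00\x00\x01"  # fixed 3-byte prefix of every PES packet


def parse_pes_header(data: bytes) -> dict:
    """Parse the fixed 6-byte header at the start of a PES packet."""
    if len(data) < 6:
        raise ValueError("need at least 6 bytes for a PES header")
    if data[:3] != PES_START_CODE_PREFIX:
        raise ValueError("missing PES start code prefix 0x000001")
    # One byte of stream ID followed by a big-endian 16-bit length field.
    stream_id, pes_packet_length = struct.unpack(">BH", data[3:6])
    return {"stream_id": stream_id, "pes_packet_length": pes_packet_length}


if __name__ == "__main__":
    # Hand-built example: stream ID 0xE0 (a video stream) and 10 bytes after the header.
    sample = b"\x00\x00\x01\xe0\x00\x0a" + bytes(10)
    print(parse_pes_header(sample))
```

The length field counts the bytes that follow it, so a receiver can skip to the next PES packet without decoding the payload.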
Video compression is the basic enabler for IPTV and DVB-H. Compression—
decompression (codec) algorithms make it possible to capture, store, and transmit
digital video signals. Codec technology has continuously improved in the last decade.
One has
. industry standards, such as MPEG-2, MPEG-4 AVC (aka, MPEG-4 Part 10),
H.264/AVC [ITU Joint Video Team (JVT) with the International Organization
for Standardization (ISO)], and AVS (Chinese national video coding standard)
and
. proprietary algorithms, such as On2, Real Video, Nancy, and Windows Media
Video (WMV). WMV was originally a Microsoft proprietary algorithm that is
now also standardized by SMPTE as VC-1.
The most recent codecs, H.264/AVC and VC-1, represent the third generation of
video compression technology. Both codecs achieve very high compression ratios
utilizing the available processing power in low-cost Integrated Circuits (ICs).
Compression ratios of 100:1–400:1, with good quality, are desirable and achievable.
Codecs now being developed by the ITU and MPEG include ITU/MPEG Joint Scalable
3
Some also call this the Program ID.
TABLE 11.6. Data Rates of Video, Which Mandates the Use of Compression Algorithms

                          Uncompressed Bit Rate (Mbps)
Picture  Pixels  Lines  10 fps B&W  10 fps Color  30 fps B&W  30 fps Color
SQCIF       128     96        0.98          1.47        2.95          4.42
QCIF        176    144        2.03          3.04        6.08          9.12
CIF         352    288        8.11         12.17       24.33         36.50
4CIF        704    576       32.44         48.66       97.32        145.98
16CIF      1408   1152      129.76        194.64      389.28        583.93
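The entries in Table 11.6 follow from multiplying pixels per line, lines, bits per pixel, and frame rate. The sketch below reproduces that arithmetic; the bit depths (8 bits per pixel for B&W, 12 for color, as with 4:2:0 chroma subsampling) are inferred from the table values rather than stated in the text, and the names are illustrative.

```python
# Picture formats from Table 11.6: (pixels per line, lines).
FORMATS = {
    "SQCIF": (128, 96),
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "16CIF": (1408, 1152),
}

# Assumed bit depths, inferred from the table values.
BITS_PER_PIXEL = {"bw": 8, "color": 12}


def uncompressed_mbps(fmt: str, fps: int, mode: str) -> float:
    """Uncompressed video bit rate in Mbps (1 Mbps = 1,000,000 bits per second)."""
    pixels, lines = FORMATS[fmt]
    return pixels * lines * BITS_PER_PIXEL[mode] * fps / 1e6


if __name__ == "__main__":
    for name in FORMATS:
        row = [uncompressed_mbps(name, fps, mode)
               for fps in (10, 30) for mode in ("bw", "color")]
        print(name, " ".join(f"{rate:8.2f}" for rate in row))
```

Comparing these rates with the 2.5 Mbps quoted earlier for compressed SD video shows why compression ratios on the order of 100:1 or more are needed.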
[Figure: ITU-T standards: H.261, H.263, H.263+, H.263++. Joint ITU-T/MPEG standards: H.262/MPEG-2, H.264/MPEG-4 AVC.]
Figure 11.8. Compression Standards That Have Evolved over the Years (Approximate Publication Date)
[Figure: receivers for each content provider deliver SDI signals to a Layer 1 video switching matrix (aka “video router”), which feeds a bank of MPEG-4 encoders.]
synchronized with the elementary streams (i.e., they are an independent control
channel).
Considerations for the encoder include:
. Bandwidth requirements for SD channels
. Ability to cap the variable bit rate, particularly for DSL delivery systems
. Bandwidth requirements for PIP for SD channels
. Types of audio/secondary audio channels to be supported
. Bandwidth requirements for audio
. Bandwidth requirements for HD channels
. Ability to cap the variable bit rate, particularly for DSL delivery systems
. Bandwidth requirements for PIP for HD channels
. Types of audio/secondary audio channels to be supported
T A B L E 11.7. Compression Applications/Characteristics
Design Concern
Tables 11.7 and 11.8 provide some basic information on encoder applications and
features.
More details on MPEG-2/4 are provided in Appendix 11.3B.
ECM keys are extracted from the multicast stream. ECMs control the decryption of
the events. To watch an event the subscriber must be entitled to watch the event (receive
the EMM) and be able to decrypt the scrambled event (have the latest ECM). EMMs
might be sent (but not always) via a separate path. Different CAS systems make use of the
ECMs/EMMs in somewhat different ways.
[Figure: CA subsystem elements: control word (CW) generator, ECM encryption, mux, transmission (Tx), demux (router), subscriber management, and subscriber authorisation.]
The CAS supports the following two types of IPTV traffic [NOR200601]:
One simplified example is shown in Figures 11.11 and 11.12. Here a code word
generator sends a newly generated key (generated every 15 s) for symmetric encryption
to the stream encryptor. The code word is also sent to an Entitlement Management
Messages Generator (EMMG) and Entitlement Control Messages Generator (ECMG).
The SMS (Service Management System) portion of the system manages subscriber
information, subscriber device information, and ordered program information. The
smart card is used for the management and control of subscriber and program delivery.
The ECMG/EMMG manages encryption and packaging of entitlements and control
words for scrambled service. The EMMG has a database of preregistered smart cards,
which tells it if the card that is associated with a specific STB is entitled to the overall
program/service (where program is a collection of individual video channels). The ECM
contains a scrambled version of the code word. In this system, the STB is equipped with a
smart card. The client on the STB uses the arriving EMMs to update entitlement
information on the smart card and uses the keys in the arriving ECMs to decrypt the IP
streams comprising the program/service. These messages enable or disable a client's access to
privileged data content coming through the CAS.
[Figure: the Code Word Generator (CWS) passes the CW (encryption key) to the encryptor and to Subscription Management (ECM and EMM generator, Service Management System); encrypted multicast data, ECMs (227.92.12.8:52128), and EMMs (227.92.12.8:52129) reach the SoftClient or smart card.]
[Figure: content 2 through content n pass through encryptors under the conditional-access system (control word generator, firewall) and over the satellite chain to receivers, DSLAMs, and smart cards. Legend: E = Encapsulator; Mo = Modulator; U = Up Converter; HPA = High-Power Amplifier; rec = Receiver.]
In this example ECMs contain the code word; the data required by the smart card or
SoftClient to decrypt the content are sent in ECMs. Control words change on a regular
basis and so an ECM always contains two control words (now and next) to ensure continuous
viewing. ECMs are specific to the scrambled content and are the same for all subscribers.
EMMs are used to deliver entitlements to, and manage the entitlements on, the
Smart Card or SoftClient. EMMs are subscriber-specific, and can be divided into three
categories:
The code words are generated randomly and are changed frequently. ECM and
EMM messages are encrypted with the multilevel keys.
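The random generation and frequent change of code words, together with the "now and next" scheme, can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and the 8-byte (64-bit) control word length is an assumption based on DVB-CSA rather than a value given in this chapter.

```python
import secrets

CW_BYTES = 8  # assumption: 64-bit control word, as used by DVB-CSA


class ControlWordGenerator:
    """Keeps a current and a next control word and rotates them on demand."""

    def __init__(self) -> None:
        self.current = secrets.token_bytes(CW_BYTES)
        self.next = secrets.token_bytes(CW_BYTES)

    def ecm_payload(self) -> dict:
        """Control words an ECM would carry (before encryption with the service key)."""
        return {"now": self.current, "next": self.next}

    def rotate(self) -> None:
        """Call at each key-change boundary (every 15 s in the example system)."""
        self.current = self.next
        self.next = secrets.token_bytes(CW_BYTES)


if __name__ == "__main__":
    gen = ControlWordGenerator()
    before = gen.ecm_payload()
    gen.rotate()
    after = gen.ecm_payload()
    # The word announced as "next" becomes the word in use after the change.
    print(after["now"] == before["next"])
```

Because the receiver already holds the next word before the change takes place, descrambling can continue without a gap at the boundary.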
. DVB-CSA, Standard Ref: ETR 289, Edition: 1.0, Support for Use of Scrambling
and Conditional Access within Digital Broadcasting Systems
. Standard Ref: TR 102 035, Edition: 1.1.1, Implementation Guidelines of the DVB
Simulcrypt Standard
. Standard Ref: ETSI TS 101 197, Edition: 1.2.1, DVB SimulCrypt: Headend
Architecture and Synchronization
. Standard Ref: ETSI TS 103 197, Edition: 1.4.1, Headend Implementation of
SimulCrypt
Common Interface. In the mid 1990s the DVB came to the realization that a single
standard could not be agreed upon and thus settled for defining a common framework
within which different systems could exist and compete. They defined an interface
structure, the common interface, to allow the STB to receive signals from several service
providers operating different CAS (the common interface connector also allows plug-in
cards for other functions besides CA; e.g., it is proposed to provide audio description for
the visually impaired using a common interface card). Since then the European
Commission has required the use of a common interface mechanism for all integrated
TV sets (excluding STBs, which may employ embedded CAS) [DIG200701].
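[Figure: a signal scrambler driven by a random number generator produces the control word (cw); two CA subsystems generate ECM X and ECM Y (with Msk X and Msk Y), which are multiplexed and transmitted (Tx); Decoder A (CA X) and Decoder B (CA Y) each demultiplex the stream and recover the cw.]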
SimulCrypt. This allows two CAS to work side by side, transmitting separate entitlement
messages to two separate types of STBs, with different CAS. It also gives the multiplex
provider the opportunity to increase the viewer base by cooperating with other multiplex
operators. See Figure 11.13. If a viewer wishes to receive services from different
providers who do not simulcrypt each other's ECMs, the only option is to acquire
separate decryption for each CAS. The common interface enables a MultiCrypt
environment, allowing an additional CA system to be added as a module. In practice,
the possibility of MultiCrypt encourages the parties to conclude a SimulCrypt agreement
[DIG200701].
The code word is encrypted using the service key, providing the first level of
protection. This service key may be common to a group of users, and typically each
encrypted service will have one service key. This encrypted code word is broadcast in an
ECM every few seconds and is what the decoder in the STB needs to descramble a
service.
[Figure: the ECM carries the control word encrypted with the service key; the EMM carries copies of the service key, each encrypted with a user key.]
Figure 11.14. Embedding Code Words and Service Keys in ECMs and EMMs
4
This section is based on reference [BRO200701].
Next, one has to make sure that authorized users (i.e., those who have paid) can
decrypt the code word, but that only authorized users can decrypt it. To do this, the service
key is itself encrypted using the user key. Each user key is unique to a single user, and so the
service key must be encrypted with the user key for each user that is authorized to view the
content. Once the system has encrypted the service key, it is broadcast as part of an EMM.
Since there is a lot more information to be broadcast (the encrypted service key must be
broadcast for each user), these are broadcast less frequently; each EMM is broadcast every
12–24 h, although some CASs broadcast them every half hour. See Figure 11.14.
Often (but not always) the encryption algorithm is symmetric, where the same key is
used for encryption and decryption in the case of the service and user keys. When the
receiver gets a CA message, it is passed to the CAS. In the case of an EMM, the receiver
will check whether the EMM is intended for that receiver (usually by checking the CA
serial number or smart card number); and if it is, it will use its copy of the user key to
decrypt the service key. The service key is then used to decrypt any ECMs that are
received for that service and recover the control word. Once the receiver has the correct
control word, it can use this to initialize the descrambling hardware and actually
descramble the content.
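The three-level key hierarchy just described (user key, service key, control word) can be sketched in a few lines. The cipher below is a toy XOR keystream used only to make the example runnable; it is not a real CA algorithm and is not secure. All names (`toy_cipher`, `make_emm`, `make_ecm`, `Receiver`, the card IDs) are hypothetical.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Symmetric toy cipher: XOR the data with a SHA-256-derived keystream.

    Applying it twice with the same key returns the original data.
    Illustration only; NOT secure and not an actual CAS algorithm.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def make_ecm(service_key: bytes, control_word: bytes) -> dict:
    """ECM: the control word encrypted with the service key (same for all users)."""
    return {"enc_control_word": toy_cipher(service_key, control_word)}


def make_emm(card_id: str, user_key: bytes, service_key: bytes) -> dict:
    """EMM: the service key encrypted with one user's key, addressed to that card."""
    return {"card_id": card_id, "enc_service_key": toy_cipher(user_key, service_key)}


class Receiver:
    def __init__(self, card_id: str, user_key: bytes) -> None:
        self.card_id = card_id
        self.user_key = user_key
        self.service_key = None

    def on_emm(self, emm: dict) -> None:
        if emm["card_id"] != self.card_id:
            return  # EMM is intended for another receiver
        self.service_key = toy_cipher(self.user_key, emm["enc_service_key"])

    def on_ecm(self, ecm: dict):
        if self.service_key is None:
            return None  # not entitled: no service key, so no control word
        return toy_cipher(self.service_key, ecm["enc_control_word"])


if __name__ == "__main__":
    service_key = secrets.token_bytes(16)
    control_word = secrets.token_bytes(8)
    paid = Receiver("card-001", secrets.token_bytes(16))
    unpaid = Receiver("card-002", secrets.token_bytes(16))

    emm = make_emm(paid.card_id, paid.user_key, service_key)
    ecm = make_ecm(service_key, control_word)
    for rx in (paid, unpaid):
        rx.on_emm(emm)
    print("entitled receiver recovers CW:", paid.on_ecm(ecm) == control_word)
    print("other receiver recovers CW:", unpaid.on_ecm(ecm) == control_word)
```

The design point the sketch illustrates is that the ECM is identical for every subscriber, while entitlement is enforced entirely by who can decrypt the service key carried in the EMM.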
While not all CAS use the same algorithms (and it is impossible to know, because
technical details of the CA algorithms are not made public), they all work in basically
the same way. There may be some differences, and the EMMs may sometimes be used
for other CA-related tasks besides decrypting service keys, such as controlling the
pairing of a smart card and an STB so that the smart card will work correctly in that
receiver.
In order to generate the EMMs correctly, the CAS needs to have some information
about which subscribers are entitled to watch which shows. The SMS is used to set which
channels (or shows) an individual subscriber can watch. This is typically a large database
of all the subscribers that is connected to the billing system and to the CA system and is
used to control the CAS and decide which entitlements should be generated
for which users. The SMS and CAS are usually part of the same package from the
CA vendor.
The ECMs and EMMs are broadcast as part of the service. The PIDs for the CA data
are listed in the Conditional Access Table (CAT), and different PIDs can be used for the
ECMs and EMMs. This makes it easier for remultiplexing, where some of the CA data
(the ECMs) may be kept, while other data (the EMMs) may be replaced.
A DVB receiver may contain several descrambling modules, each of which takes a
transport stream as input. Each module is logically the same, as we have described above,
but different modules may be capable of handling different CASs. The DVB-CI allows a
receiver to swap CA modules by defining a standard interface for the CAS. The DVB-CI
uses a PCMCIA interface for the CA module—any module that complies with the
DVB-CI specification will work in any receiver equipped with a DVB-CI slot. DVB-CI is
not the only standard interface for scrambling systems; the OpenCable POD interface is
also in use and in some cases is more popular than DVB-CI.
While NDS and Nagravision are the two most common CASs on the market at press
time, other CASs are provided by Conax, Irdeto Access, Verimatrix, Widevine, Philips
(the CryptoWorks system), and France Telecom (the Viaccess system). There are other
systems from companies, such as Motorola, who make CASs, but these are not often used
in DVB systems. DVB systems can offer pluggable encryption modules using the
DVB-CI, which uses a PCMCIA card to contain the encryption hardware and software.
This means that the user can switch encryption systems (for instance, if they change
their cable company) without having to replace the entire STB. This is a big advantage for
open standards and really enables the move from a vertical market to a horizontal one.
Some companies (NDS, for instance) are reportedly not convinced of the security of
the DVB-CI system, and so not all CASs are available as CI modules. The Advanced
Television Systems Committee (ATSC) uses a similar, though slightly more secure,
mechanism called the POD (Point of Deployment) module, known as CableCARD in
OpenCable systems. These are more widely deployed in U.S. markets, and all OCAP
receivers will include a CableCARD slot.
Figure 11.15. Encapsulation of a Subnetwork IPv4 or IPv6 PDU to Form an MPEG-2 Payload
Unit
MPE is a scheme used in DVB that encapsulates PDUs; each Section is sent in a
series of TS packets using a single TS logical channel [ETS200301]. The MPEG-2 TS
has been widely accepted not only for providing digital TV services but also as a
subnetwork technology for building IP networks. Examples of systems using MPEG-2
include the DVB and ATSC standards for digital television. To make use of an MPEG-2
TS environment, a network device, known as an encapsulator,5 receives PDUs (e.g., IP
packets or Ethernet frames) and formats these into Subnetwork Data Units (SNDUs).
An encapsulation (or convergence) protocol transports each SNDU over the MPEG-2
TS service and provides the appropriate mechanisms to deliver the encapsulated PDU
to the receiver IP interface. In forming an SNDU, the encapsulation protocol typically
adds header fields that carry protocol control information, such as the length of SNDU,
receiver address, multiplexing information, payload type, and sequence numbers. The
SNDU payload is typically followed by a trailer which carries an integrity check [e.g.,
Cyclic Redundancy Check (CRC)]. When required, an SNDU may be fragmented
across a number of TS packets. See Figures 11.15 and 11.16 [RFC4259]. Examples of
existing encapsulation/convergence protocols include AAL5 (an ITU standard)
and MPEG-2 MPE (an ETSI/DVB standard). In summary, the standard DVB
way of carrying IP datagrams in an MPEG-2 TS is to use MPE; with MPE each IP
datagram is encapsulated into one MPE section. A stream of MPE sections is then put
into an ES, that is, a stream of MPEG-2 TS packets with a particular PID. Each MPE
section has a 12-B header, a 4-B CRC (CRC-32) tail, and a payload length, which is
identical to the length of the IP datagram, which is carried by the MPE section
[FAR200601].
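The MPE overhead figures above (12-byte header, 4-byte CRC) can be combined with the 184-byte TS payload to estimate how many TS packets one IP datagram occupies. The sketch below makes two simplifying assumptions that are not taken from the text: each MPE section starts in a fresh TS packet (no section packing), and the first TS packet of a section spends 1 byte on a pointer field.

```python
import math

TS_PACKET_BYTES = 188
TS_PAYLOAD_BYTES = 184   # 188 bytes minus the 4-byte TS header
MPE_HEADER_BYTES = 12
MPE_CRC_BYTES = 4
POINTER_FIELD_BYTES = 1  # assumption: one pointer byte where a section starts


def mpe_section_length(ip_datagram_bytes: int) -> int:
    """Length of the MPE section that carries one IP datagram."""
    return MPE_HEADER_BYTES + ip_datagram_bytes + MPE_CRC_BYTES


def ts_packets_needed(ip_datagram_bytes: int) -> int:
    """TS packets used when each section starts in a new TS packet (no packing)."""
    total = POINTER_FIELD_BYTES + mpe_section_length(ip_datagram_bytes)
    return math.ceil(total / TS_PAYLOAD_BYTES)


def efficiency(ip_datagram_bytes: int) -> float:
    """Fraction of transmitted TS bytes that are IP datagram bytes."""
    return ip_datagram_bytes / (ts_packets_needed(ip_datagram_bytes) * TS_PACKET_BYTES)


if __name__ == "__main__":
    for size in (40, 576, 1500):
        print(f"{size}-byte datagram: {ts_packets_needed(size)} TS packets, "
              f"efficiency {efficiency(size):.1%}")
```

Under these assumptions small datagrams are carried inefficiently, since the unused remainder of the last TS packet is padding.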
+-----------------------------------------+
| Encap Header|Subnetwork Data Unit |
+-----------------------------------------+
Figure 11.16. Encapsulation of a PDU (e.g., IP Packet) into a Series of MPEG-2 TS Packets. Each TS
Packet Carries a Header with a Common ID Value Denoting the MPEG-2 TS Logical Channel
5
This is also known as an IP Encapsulator (IPE)
[Figure: MPEG-4 digital video arrives as IP over Ethernet (1000BaseT) at the IP encapsulator (IPE), which outputs IP over MPEG-2 TS on ASI to a multiplexer and then, via ASI, to a DVB-S2 modulator.]
[Figure: error-rate curves (vertical axis from 10^-2 to 10^-8) versus Eb/No (2 to 12 dB) for no coding and for code rates 7/8, 3/4, and 1/2, with the coding gain indicated.]
Table 11.9 shows the DVB specs for Eb/No required for various modulation and
coding rates (includes RS); Turbo values are from vendor specs. The new DVB-S2
standard specifies LDPC coding in lieu of Turbo codes; LDPC performs slightly better than the
values shown in the table for turbo. An Eb/No of 6–9 dB (and a C/N, or carrier-to-noise
ratio, of 12 dB) is desirable. In this satellite-based design, jitter is nonexistent.
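Eb/No and C/N are related by the standard link-budget identity C/N = Eb/No + 10 log10(Rb/B), where Rb is the information bit rate and B is the noise bandwidth. The sketch below applies that identity; the numeric inputs in the demonstration are hypothetical and are not taken from Table 11.9.

```python
import math


def cn_db(ebno_db: float, bit_rate_bps: float, bandwidth_hz: float) -> float:
    """Carrier-to-noise ratio (dB) for a given Eb/No, bit rate, and noise bandwidth."""
    return ebno_db + 10 * math.log10(bit_rate_bps / bandwidth_hz)


if __name__ == "__main__":
    # Hypothetical carrier: 40 Mbps of information in a 30 MHz noise bandwidth.
    for ebno in (6.0, 9.0):
        print(f"Eb/No = {ebno} dB gives C/N = {cn_db(ebno, 40e6, 30e6):.2f} dB")
```

The offset between the two quantities depends only on the ratio of bit rate to bandwidth, so stronger coding (a lower code rate) lowers the C/N needed for the same Eb/No.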
A set of international standards for digital TV has been developed by the DVB
Project, an industry consortium with about 300 members, and published by a Joint
Technical Committee (JTC) of ETSI, CENELEC, and European Broadcasting Union
(EBU). These are collectively known as Digital Video Broadcast. IPTV makes use of a
number of these standards, particularly when making use of satellite links. Standards
have emerged in the past 10 years for defining the physical layer and data-link layer of a
distribution system, as follows:
Devices interact with the physical layer via a Synchronous Parallel Interface
(SPI), Synchronous Serial Interface (SSI), or ASI. All information in DVB is
transmitted in MPEG-2 transport streams with some additional constraints
(DVB-MPEG).
[Figure: content sources (Content 1 through Content n) and conditional-access systems connect through a backbone core to metropolitan cores and DSLAMs.]
Figure 11.20. Distribution Networks
devices in the network. It is also important to keep multicast traffic from “splashing
back” and flooding unrelated ports. IGMP snooping and other techniques may be
appropriate.
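The idea behind IGMP snooping can be illustrated with a small model of a Layer 2 switch that records which ports have joined which multicast group and forwards a stream only to those ports. This is a simplified sketch: the class and method names are hypothetical, and details of real implementations (query handling, membership timers, leave latency) are omitted.

```python
from collections import defaultdict


class SnoopingSwitch:
    """Toy model of a switch that forwards multicast only to interested ports."""

    def __init__(self, ports, router_ports=()):
        self.ports = set(ports)
        self.router_ports = set(router_ports)  # ports that lead toward a multicast router
        self.members = defaultdict(set)        # group address -> ports with listeners

    def _check(self, port):
        if port not in self.ports:
            raise ValueError(f"unknown port {port}")

    def on_report(self, port, group):
        """A membership report (join) for the group was seen on this port."""
        self._check(port)
        self.members[group].add(port)

    def on_leave(self, port, group):
        """A leave for the group was seen on this port."""
        self._check(port)
        self.members[group].discard(port)
        if not self.members[group]:
            del self.members[group]

    def egress_ports(self, ingress_port, group):
        """Ports that should receive a multicast frame for the group."""
        self._check(ingress_port)
        targets = self.members.get(group, set()) | self.router_ports
        return targets - {ingress_port}  # never send back out the ingress port


if __name__ == "__main__":
    switch = SnoopingSwitch(ports=[1, 2, 3, 4], router_ports=[1])
    switch.on_report(3, "239.1.1.1")
    print(switch.egress_ports(1, "239.1.1.1"))  # only the port that joined
    print(switch.egress_ports(1, "239.9.9.9"))  # no listeners: nothing is flooded
```

Without such a table, a switch would treat the video stream as broadcast traffic and copy it to every port, which is the flooding behavior the text warns against.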
Keep in mind that the hand-off point at the headend, the demarc between the IPTV
content provider and the telco, will need administrative and management connectivity
back to the CAS (for the EMMs) and also for the middleware function when that is done
in a centralized fashion.
User Management. The middleware provides the dynamic User Interface (UI) to the
STB. It also keeps track of the STB security ID, as well as maintaining the STB usage
log for billing and security purposes. It provides the Web-based STB remote control
capability. The STBs can also be remotely upgraded. The UI of the STB may be
customized and assigned to individual users or multiple user groups.
Content Management. The software provides flexible control over the content. For
example, one can apply different costs to different content types, decide on the
usage pattern, or create different packages for different content groups.
Infrastructure Management. The middleware allows the administrator to configure
server roles and functions and control input and output parameters. The middleware
provides manual and automated tools for load balancing. It can monitor the
availability of all servers and can activate a failover mechanism in the event of
finding a component failure.
[Figure: advertising elements: advertisement schedules/campaigns, ad exchange, ad exchange advertising campaign manager, entitlement server/subscriber management system, profiling and targeting engines (targeting results), session resource manager, playlists, video server, and VPN.]
variety of TV interfaces. The STB must decrypt the incoming signal; therefore, it must
have access to the CAS both at provisioning time as well as in real time to process the
ECMs and EMMs. It needs a chipset to run the decryption. Also, it must be able to take IP
packets that contain the encapsulated MPEG-4 TS and convert that into a displayable
video signal by supporting the deencapsulation and the decoding function. Importantly,
the STB must support the middleware function. The middleware supports the user
interface and the ability to navigate the EPG, order new channels, VoD, DVR, etc. Also, it
must support SAP, closed captions, EAS (Emergency Alert System), parental controls,
picture-in-picture display, and multistream viewing (e.g., support three separate TV sets
in three separate rooms—this is sometimes done by packaging three units into one
chassis and using short-distance RF signaling to support the two secondary room remote
controls). Typical STB manufacturers include, but are not limited to, Cisco Scientific
Atlanta, Amino, LG Electronics, Motorola, Philips, Samsung, and Thomson
Multimedia.
. VoD Server This device stores the content. Some VoD servers are proprietary
hardware; others are software that works with commercial server and storage
equipment.
. VoD Catcher This equipment receives new content from the content provider
(such as a movie studio) via satellite. The brand/model of a VoD catcher is
specified by the content provider to work with the “VoD pitcher” used by the
content provider since there are no industry standards. An operator may need
several VoD catchers to receive content from multiple sources.
. VoD Cache The VoD cache is a distributed VoD server. Frequently accessed
content may be pushed to caching equipment located closer to the user.
APPENDIX 11.B 271
+-+-+-+-+------------------+-----------+
|T|V|A|O|                  |           |
|e|i|u|t|                  |     S     |
|l|d|d|h|        IP        |     I     |
|e|e|i|e|                  |           |
|t|o|o|r|                  |     T     |
|e| | | |                  |     a     |
|x| | | +------+-----+-----+     b     |
|t| | | |      |     | MPE |     l     |
| | | | | AAL5 | ULE +-----+     e     |
| | | | |      |     |Priv.|           |
+-+-+-+-+------+     |Sect.+-----------+
|  PES  | ATM  |     |     |  Section  |
+-------+------+-----+-----+-----------+
|              MPEG-2 TS               |
+--------------------------------------+
6
ISO/IEC MPEG-4 Part-2 video specification does not specify the design of an encoder. It defines only the
syntax and semantics of a coded bit stream. Therefore, an encoder has proprietary design.
The most basic component in MPEG is the elementary stream. A program (e.g., a
television video program) typically contains a combination of ESs (say, one for video,
one or more for audio, one for control data). Each ES generated at the output of an MPEG
audio encoder and an MPEG video encoder contains a single type of (usually
compressed) signal. For video and audio, the information is organized into access units, each
representing a fundamental unit of encoding; for example, in video, an access unit usually
is a complete encoded video frame [FAI200101]. A packetization function (typically
within the encoder stage) accumulates the data into a stream of packetized elementary
stream packets. See Figure 11.B2.
A PES datagram is a fixed (or variable) sized block, with up to 65,536 bytes per
block. A PES is usually organized to contain an integral number of ES access units.
Figure 11.B3 depicts the PES format.
The MPEG-2 standard defines two ways for multiplexing different elementary
stream types: (i) program stream, and (ii) transport stream. See Figure 11.B4.
An MPEG-2 program stream (MPEG-2 PS) is principally intended for storage and
retrieval from storage media. It supports grouping of video, audio, and data elementary
streams that have a common time base. Each PS consists of only one content (TV)
program. The PS is used in error-free environments; for example, DVDs use the
MPEG-2 PS. A PS is a group of tightly coupled PES packets referenced to the same
time base.
An MPEG-2 transport stream (MPEG-2 TS) multiplexes multiple PESs (which may
or may not have a common time base) into a single stream, along with
information for synchronizing them. At the same
time, the TS segments the PES into the smaller fixed-size TS packets. An entire video
frame may be mapped in one PES packet. PES headers distinguish PES packets of
various streams and also contain time stamp information. PESs are generated by the
packetization process; the payload consists of the data bytes taken sequentially from
the original ES. There are some constraints for forming TS packets: (i) The first byte of
the PES packet must be the first byte of the transport packet payload and (ii) each
transport packet must contain data from only one PES packet.
A TS may correspond to a single TV program; this type of TS is normally called a
Single-Program Transport Stream (SPTS). In most cases one or more SPTSs are
combined to form a Multiple-Program Transport Stream (MPTS). This larger aggre-
gate also contains the control information [Program-Specific Information (PSI)]
required to coordinate the DVB system and any other data that is to be sent
[FAI200101].
As noted, the TS consists of a stream of short fixed-length TS packets. A TS is
intended for non-error-free environments (i.e., environments that entail a transmission
link). The MPEG-2 TS packet is 188 bytes long (see footnote 7): a 4-byte TS header
followed by 184 bytes that carry an Adaptation field (a header extension), payload
(information section), or both. When no Adaptation field is present, the packet
therefore carries 184 bytes of payload.
7
Two MPEG TS packets (2 × 188 = 376 bytes) plus the 8 B trailer added by ATM Adaptation
Layer 5 (AAL5) total 384 bytes, which fills exactly eight 48-byte ATM cell payloads.
[Figure: uncompressed video and audio streams enter MPEG encoders, each producing an encoded
(compressed) MPEG-2 elementary stream (ES); packetizers convert the video, audio, and data ESs
into PESs; the Systems Layer multiplexer and SAR (segmentation/reassembly) function combines the
PESs into a Transport Stream (TS).]
PES Indicators (provide additional information about the stream to assist the decoder at
the receiver):
[Figure: TS packet format (188 bytes). The 4-byte header contains sync_byte,
transport_error_indicator, payload_unit_start_indicator, transport_priority, PID,
transport_scrambling_control, adaptation_field_control, and continuity_counter. The optional
Adaptation field includes the Adaptation field length, discontinuity indicator, random access
indicator, and elementary-stream-priority indicator.]
A key field in the TS header, which plays an important role in the downstream use of
the TS, is the 13-bit PID. The PID (see footnote 8) identifies the TS logical channel, and
hence the elementary stream and program, to which a TS packet belongs. In MPEG-2 systems,
TS logical channels are identified by their PIDs and provide multiplexing, addressing, and
error reporting. Because the PID is a 13-bit field, PID values range from 0 to 8191
decimal (0x1FFF in hexadecimal), some of which are reserved for transmission of PSI
(also called SI) tables. Nonreserved TS logical channels may be used to carry audio,
video, IP packets, or other data. The value 8191 decimal (0x1FFF) indicates a null
packet that is used to maintain the physical bearer bit rate when there are no other
MPEG-2 TS packets to be sent [RFC4259]. The Adaptation field supports various
options and it may or may not be present. The presence of an Adaptation field is indicated
by the Adaptation field control bits in a TS packet. If present, the Adaptation field directly
follows the 4-byte packet header, before any user payload data; it contains a variety of
information used for timing and control, including the Program Clock Reference (PCR)
field. Byte stuffing is used to ensure that the TS packet is always 188 bytes long; any
remaining portion of the TS packet is stuffed with bytes of value 0xFF (in
some instances the Adaptation field is present only to provide the stuffing function). See
Figure 11.B5. The MPEG-TS is not a time division multiplex because packets
with any PID may be inserted into the TS at any time by the TS multiplexor. If no packets
are available at the multiplexor, it inserts null packets (denoted by a PID value of
0x1FFF) to retain the specified TS bit rate [FAI200101]. Compared with the PS, the TS makes
it easy to detect the start and end of frames and to recover from packet loss or
corruption. However, a TS is more difficult to produce and demultiplex than a PS.
Mapping functions are required to relate TS logical channels to IP addresses, to map
8
There are similarities between the way PIDs are used and the operation of virtual channels in ATM. However,
unlike ATM, a PID defines a unidirectional broadcast channel and not a point-to-point link. Contrary to
ATM, there is, as yet, no specified standard interface for MPEG-2 connection setup or for signaling.
TS logical channels to IP-level QoS, and to associate IP flows with specific subnetwork
capabilities.
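The header fields discussed above can be extracted with simple bit masking. The following is a minimal sketch, assuming the 4-byte header layout of ISO/IEC 13818-1 (including the sync byte value 0x47, which the text does not state); the function name and the sample null packet are hypothetical.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47   # assumed from ISO/IEC 13818-1
NULL_PID = 0x1FFF  # null packets, used to maintain the bearer bit rate


def parse_ts_header(packet: bytes) -> dict:
    """Decode the 4-byte header of one TS packet (hypothetical helper)."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a TS packet is exactly 188 bytes long")
    if packet[0] != SYNC_BYTE:
        raise ValueError("first byte is not the sync byte")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return {
        "transport_error_indicator": bool(b1 & 0x80),
        "payload_unit_start_indicator": bool(b1 & 0x40),
        "transport_priority": bool(b1 & 0x20),
        "pid": ((b1 & 0x1F) << 8) | b2,            # 13-bit PID
        "transport_scrambling_control": (b3 >> 6) & 0x03,
        "adaptation_field_control": (b3 >> 4) & 0x03,
        "continuity_counter": b3 & 0x0F,
    }


# Illustrative null packet: PID 0x1FFF, payload only, stuffed with 0xFF.
null_packet = bytes([0x47, 0x1F, 0xFF, 0x10]) + b"\xff" * 184
header = parse_ts_header(null_packet)
print(header["pid"] == NULL_PID)
```

A receiver would typically discard packets whose PID equals NULL_PID and pass the remaining packets to per-PID handlers.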
Figure 11.B6 puts the PES and TS concepts together. Two options are available for
inserting a PES packet into the TS packet payload [FAI200101]:
1. Carry only one PES (or part of a single PES) in a TS packet. This is the simplest
environment from both the encoder and receiver viewpoints. This allows the TS
packet header to indicate the start of the PES, but since a PES packet may have an
arbitrary length, it also requires the remainder of the TS packet to be padded,
ensuring correct alignment of the next PES to the start of a TS packet.
2. In general, a given PES packet spans several TS packets, so that the majority of
TS packets contain continuation data in their payloads. When a PES packet
begins, the payload_unit_start_indicator bit is set to 1, which means the first byte
of the TS payload contains the first byte of the PES packet header. Only one PES
packet can start in any single TS packet. The TS header also contains the PID so
that the receiver can accept or reject PES packets at a high level without
burdening the receiver with excessive processing. This approach, however, has
an efficiency impact on short PES packets.
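The segmentation described above can be sketched as follows. This is a simplified illustration, not an encoder implementation: it assumes the ISO/IEC 13818-1 header layout, places one PES packet into consecutive TS packets, sets payload_unit_start_indicator only on the first packet, and pads the last packet with an Adaptation field used purely for stuffing. The function name and the PID value are hypothetical.

```python
def packetize_pes(pes: bytes, pid: int) -> list:
    """Split one PES packet into 188-byte TS packets (illustrative sketch)."""
    packets = []
    continuity = 0
    offset = 0
    while offset < len(pes):
        chunk = pes[offset:offset + 184]
        pusi = 0x40 if offset == 0 else 0x00  # PES packet starts here
        header = bytearray([0x47, pusi | ((pid >> 8) & 0x1F), pid & 0xFF, 0])
        pad = 184 - len(chunk)
        if pad == 0:
            header[3] = 0x10 | continuity     # payload only
            body = chunk
        else:
            header[3] = 0x30 | continuity     # adaptation field + payload
            adaptation = bytes([pad - 1])     # adaptation_field_length
            if pad > 1:
                # one flags byte (all zero), then 0xFF stuffing bytes
                adaptation += b"\x00" + b"\xff" * (pad - 2)
            body = adaptation + chunk
        packets.append(bytes(header) + body)
        offset += len(chunk)
        continuity = (continuity + 1) & 0x0F
    return packets


pes_packet = bytes(range(200))               # stand-in for a real PES packet
ts_packets = packetize_pes(pes_packet, pid=0x100)
print(len(ts_packets), [len(p) for p in ts_packets])
```

With a 200-byte PES packet, the second TS packet carries only 16 payload bytes and 168 bytes of Adaptation field, which illustrates the efficiency impact on short PES packets.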
The TS also includes PSI. PSI consists of transport packets used by
the decoder to acquire information about the TS. The tables consist of a description of the
ESs that need to be combined to build programs and a description of the programs. Each
PSI table is carried in a sequence of PSI sections; the length of a section allows a decoder to
identify the next section in a packet. Tables are sent periodically by including them in the
transmitted transport multiplex [FAI200101]. PSI tables include:
To identify the required PID to demultiplex a particular PES, the remote device
searches for a description in the PAT. The PAT lists all programs in the multiplex; each
content program is associated with a set of PIDs (one for each PES) that correspond to a
PMT carried as a separate PSI section. There is one PMT per program. DVB also adds a
number of additional tables. See Figure 11.B7 for a pictorial example.
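The lookup from program to PIDs can be sketched with two small tables. This is only an illustration: the program numbers and PID values are hypothetical, the tables are plain dictionaries rather than parsed PSI sections, and the fact that the PAT is carried on PID 0 is taken from ISO/IEC 13818-1 rather than from the text.

```python
PAT_PID = 0x0000  # assumed from ISO/IEC 13818-1: the PAT is carried on PID 0

# PAT: program_number -> PID of that program's PMT (hypothetical values)
pat = {1: 0x0100, 2: 0x0200}

# PMTs: PMT PID -> elementary streams of the program (hypothetical values)
pmts = {
    0x0100: [("video", 0x0101), ("audio", 0x0102)],
    0x0200: [("video", 0x0201), ("audio", 0x0202), ("audio", 0x0203)],
}


def pids_for_program(program_number: int) -> set:
    """Return every PID a receiver must accept to decode one program."""
    pmt_pid = pat[program_number]
    stream_pids = {pid for _kind, pid in pmts[pmt_pid]}
    return {PAT_PID, pmt_pid} | stream_pids


wanted = pids_for_program(2)
print(sorted(hex(pid) for pid in wanted))
```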
[Figure 11.B6: PES packets (PES header plus PES packet payload) are segmented into the payloads
of successive 188-byte TS packets, each with a 4-byte TS header; an Adaptation field is used for
stuffing where a PES packet does not fill the last TS packet.]
Figure 11.B7. Single-Program Transport Stream (Video, Audio, and PSI PES)
As implied pictorially in Figure 11.B7, TSs consist of a number of related ESs (e.g.,
the video and audio of a TV program), where the decoding of the ESs requires
synchronization to ensure that the audio playback is aligned with the corresponding
video frames. Time stamps are typically sent in the transport stream. There are two types
of time stamps:
Figure 11.B8. Example Showing MPEG-2 TS Logical Channels Carried over Two MPEG-2 TS
Multiplexes
In all cases, the final result is a TS multiplex that is transmitted over the physical
bearer toward the receiver.
Packet data for transmission over an MPEG-2 transport multiplex is passed to an
encapsulator, sometimes known as a gateway. This receives PDUs such as Ethernet
frames or IP packets and formats each into an SNDU by adding an encapsulation header
and trailer. The SNDUs are subsequently fragmented into a series of TS packets. To
receive IP packets over an MPEG-2 TS multiplex, a receiver needs to identify the specific
TS multiplex (physical link) and also the TS logical channel (the PID value of a logical
link). It is common for a number of MPEG-2 TS logical channels to carry SNDUs;
therefore, a receiver must filter (accept) IP packets sent with a number of PID values and
must independently reassemble each SNDU.
A receiver that simultaneously receives from several TS logical channels must filter
the other unwanted TS logical channels by employing, for example, specific hardware
support. Packets for one IP flow (i.e., a specific combination of IP source and destination
addresses) must be sent using the same PID. It should not be assumed that all IP packets
are carried on a single PID, as in some cable modem implementations, and multiple PIDs
must be allowed in the architecture. Many current hardware filters limit the maximum
number of active PIDs (e.g., 32), although if needed, future systems may reasonably be
expected to support more.
Figure 11.B9. An Example Configuration for a Unidirectional Service for IP Transport over
MPEG-2
In some cases, receivers may need to select TS logical channels from a number of
simultaneously active TS multiplexes. To do this, they need multiple physical receive
interfaces [e.g., radio frequency (RF) front ends and demodulators]. Some applications
also envisage the concurrent reception of IP packets over other media that may not
necessarily use MPEG-2 transmission.
resolution video, such as Common Intermediate Format (CIF; see footnote 9) (352 × 288, 4:2:0,
at 30 frames/second), utilized in video streaming applications, requires over 36 Mbps.
Direct delivery of these video streams is not possible except on a true fiber-based FTTH
system—since this data rate is many times more than what can be achieved over DSL
systems. Furthermore, storing one 90-min NTSC video movie in the uncompressed
4:2:2 YCrCb form requires over 110 Gbytes—this is not possible on consumer media
such as DVD-R since this data is 25 times the storage capability of a standard DVD-R.
Obviously, compression is needed to store or transmit digital video. A compression
ratio of 100 : 1 (for IPTV) and 400 : 1 (for DVB-H) is sought; one wishes to encode digital
video by using as few bits as possible while maintaining visual quality. Typical factors to
consider when selecting the codec for an application include the visual quality require-
ments for the application, the environment (speed, latency, and error characteristics) of
the transmission channel or storage media, the desired resolution, the target bit rate, the
color depth, the number of frames per second, whether the content and/or display are
progressive or interlaced, and the cost of real-time implementation of the encoding and
decoding. Typically newer algorithms, such as H.264/AVC, achieve higher compression
but require increased processing, which can impact the cost for encoding and decoding
devices, system power dissipation, and system memory [GOL200601]. For IPTV, the
industry is settling on MPEG-4 at 2.5–3 Mbps for SD, 8–11 Mbps for HD, and 384 kbps
for DVB-H.
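The uncompressed figures quoted above can be checked with a short calculation. The sketch below makes several assumptions that are not stated in the text: 8 bits per sample, 1.5 bytes per pixel for 4:2:0 and 2 bytes per pixel for 4:2:2, a 720 × 480 active picture at 30 frames/second for NTSC, and a 4.7-Gbyte DVD-R.

```python
def raw_bitrate_bps(width, height, fps, bytes_per_pixel):
    """Uncompressed video bit rate in bits per second."""
    return width * height * bytes_per_pixel * 8 * fps


# CIF, 4:2:0 (1.5 bytes per pixel), 30 frames/second
cif_bps = raw_bitrate_bps(352, 288, 30, 1.5)

# Assumed NTSC SD picture: 720 x 480, 4:2:2 (2 bytes per pixel), 30 frames/second
sd_bps = raw_bitrate_bps(720, 480, 30, 2)
movie_bytes = sd_bps / 8 * 90 * 60   # 90-minute movie

dvd_r_bytes = 4.7e9                  # assumed single-layer DVD-R capacity

print(f"CIF: {cif_bps / 1e6:.1f} Mbps")
print(f"90-min movie: {movie_bytes / 1e9:.0f} Gbytes")
print(f"Ratio to DVD-R: {movie_bytes / dvd_r_bytes:.1f}x")
```

Under these assumptions the CIF stream comes to about 36.5 Mbps and the movie to about 112 Gbytes, roughly 24 times the assumed DVD-R capacity, which is in line with the figures in the text.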
Two key standards organizations that have defined video codecs over the years are:
On occasion, the groups have worked together, such as in the Joint Video Team
(JVT), to define MPEG-4 Part 10. ITU and MPEG continue to define new standards
for improved efficiency; for example, standards being developed at press time included
ITU/MPEG Joint Scalable Video Coding (an amendment to H.264/AVC) and MPEG
Multiview Video Coding. H.264 also recently defined a new mode, Fidelity Range
Extension, to address professional digital editing, HD-DVD, and lossless coding
applications. In addition to industry standards, a number of vendor-developed solutions
have emerged in recent years, especially for Internet-based streaming media applications.
These proprietary systems include Real Networks Real Video (RV10), Microsoft
Windows Media Video 9 (WMV9) series, ON2 VP6, and Nancy. In 2003, Microsoft
proposed to the Society of Motion Picture and Television Engineers (SMPTE) that the
WMV9 bitstream and syntax be standardized; the result is now the SMPTE VC-1 standard.
These codecs, however, are not generally used in IPTV applications.
9
Common Intermediate Format is a set of standard video formats used principally in videoconferencing
applications. The original CIF [also known as Full CIF (FCIF)] has a resolution of 352 × 288; QCIF
(Quarter CIF) has a resolution of 176 × 144; SQCIF (Subquarter CIF) has a resolution of 128 × 96;
4CIF (4 × CIF) has a resolution of 704 × 576; 16CIF (16 × CIF) has a resolution of 1408 × 1152.
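[Figure 11.B10: at the sender, a matrix converts R, G, B into Y, R − Y, and B − Y, which an NTSC
encoder combines for transmission; at the receiver, an NTSC decoder recovers Y, R − Y, and B − Y,
and a matrix converts them back to R, G, B.]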
(see Figure 11.B10). Hence, in a composite signal, which is the signal used for traditional
analog TV distribution:
. RGB components from the camera are generally translated to a set of color
difference components (such as Y, R − Y, B − Y) before being encoded to NTSC
or PAL for transmission (in modern equipment all these operations may take place
in the camera).
. The composite signal must be decoded in the receiver to a color difference format
and then translated to RGB for display.
In noninterlaced video scan, lines are displayed sequentially down the display unit
(that is, display line 1, 2, 3, 4, . . ., n). This is also called progressive scanning. In
interlaced video, alternate scan lines are displayed sequentially down the display, with
the even lines being shown in one field (lines 2, 4, 6, 8, . . ., n) and the odd lines being
shown in the next field (1, 3, 5, 7, . . ., n − 1).
Similar concepts apply to digital TV. YCbCr is the component color space defined
by ITU-R BT.601 to support digital TV: Cr = R − Y and Cb = B − Y. The ranges of the
digital numbers are
. Y nominal range: 16–235
. Cb and Cr nominal range: 16–240 with zero corresponding to 128
. Y and CbCr components have different bandwidth and dynamic ranges
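The nominal ranges above can be illustrated with a small conversion routine. This is a sketch under assumptions that are not spelled out in the text: the luma weights (0.299, 0.587, 0.114) and the scale factors for the color differences are the commonly used ITU-R BT.601 values, the inputs are gamma-corrected R, G, B values in the range 0 to 1, and the function name is hypothetical.

```python
def rgb_to_ycbcr601(r: float, g: float, b: float) -> tuple:
    """Map R, G, B in [0, 1] to 8-bit (Y, Cb, Cr) in the nominal ranges."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # assumed BT.601 luma weights
    cb = (b - y) / 1.772                   # scaled B - Y, in [-0.5, 0.5]
    cr = (r - y) / 1.402                   # scaled R - Y, in [-0.5, 0.5]
    return (
        round(16 + 219 * y),    # Y: 16 to 235
        round(128 + 224 * cb),  # Cb: 16 to 240, zero at 128
        round(128 + 224 * cr),  # Cr: 16 to 240, zero at 128
    )


print(rgb_to_ycbcr601(1.0, 1.0, 1.0))  # white
print(rgb_to_ycbcr601(0.0, 0.0, 0.0))  # black
print(rgb_to_ycbcr601(0.0, 0.0, 1.0))  # blue
```

White and black map to the ends of the Y range with both color differences at 128, while a saturated blue drives Cb to the top of its range.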
[Figures: chroma sampling patterns by scan line, marking the positions that carry Y, Cb, and Cr
samples and the positions that carry a Y sample only.]
A typical IPTV codec has interlaced 4:2:2 input and outputs ESs or PESs. See
Figure 11.B16.
Next we discuss some basic constructs that are part of the MPEG coding schemes.
A video sequence is a sequence of frames that begins with a sequence header (may
contain additional sequence headers), includes one or more groups of pictures, and ends
with an end-of-sequence code. A Group of Pictures (GOP) is comprised of a header and a
series of one or more pictures intended to allow random access into the video sequence. A
picture is the primary coding unit of a video sequence. A picture consists of three
rectangular matrices representing luminance (Y) and two chrominance (Cb and Cr)
values. The Y matrix has an even number of rows and columns. The Cb and Cr matrices
are (usually) half the size of the Y matrix in each direction (horizontal and vertical)
(details on this below). A slice is one or more “contiguous” macroblocks (the order of the
macroblocks within a slice is from left to right and top to bottom). Slices are important in
the handling of errors—if the bit stream contains an error, the decoder can skip to the start
of the next slice. A Macroblock (MB) is a 16-pixel by 16-line section of luminance
components and the corresponding 8-pixel by 8-line section of the two chrominance
components. Figure 11.B15 shows the spatial location of luminance and chrominance
components. A macroblock contains four Y blocks, one Cb block, and one Cr block.
Numbers correspond to the ordering of the blocks in the data stream, with block 1 first.
Each macroblock relates to 16 pixels by 16 lines of Y and the spatially corresponding 8
pixels by 8 lines of Cb and Cr. That is, a macroblock consists of four luminance blocks
and two spatially corresponding color difference blocks: (i) each luminance block
relates to 8 pixels by 8 lines of Y, so the four luminance blocks together cover the
16 × 16 Y area; (ii) each chrominance block relates to 8 pixels by 8 lines of Cb or Cr,
which covers the same 16 × 16 picture area because chrominance is subsampled by two in
each direction, leaving a 4 × 4 patch of Cb samples and of Cr samples for each 8 × 8 Y
block. Figure 11.B17 illustrates some of these concepts.

[Figure 11.B16: typical codec. A 4:2:2 interlaced video signal passes through DCT, quantization
(Q/IQ, with IDCT in the reconstruction path), and VLC to produce the ES or PES bit stream.
DCT/IDCT: discrete cosine transform/inverse discrete cosine transform.]
A Cb and Cr diagram shows the relative x-y locations of the luminance and
chrominance components. Note that for every four luminance values, there are two
associated chrominance values: one Cb value and one Cr value. See Figure 11.B18;
note that the location of the Cb and Cr values is the same, so only one circle is shown
in the figure.
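The block counts per macroblock follow directly from the chroma subsampling factors. The sketch below is illustrative only; the table of horizontal and vertical subsampling divisors and the function name are assumptions introduced here, and the macroblock described in the text (four Y blocks, one Cb block, one Cr block) corresponds to the 4:2:0 entry.

```python
# (horizontal divisor, vertical divisor) applied to chroma; assumed values
CHROMA_DIVISORS = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}


def macroblock_layout(fmt: str) -> dict:
    """Count 8 x 8 blocks and samples in one 16 x 16 macroblock."""
    h_div, v_div = CHROMA_DIVISORS[fmt]
    luma_blocks = (16 // 8) * (16 // 8)
    chroma_w, chroma_h = 16 // h_div, 16 // v_div
    chroma_blocks = (chroma_w // 8) * (chroma_h // 8)
    return {
        "Y": luma_blocks,
        "Cb": chroma_blocks,
        "Cr": chroma_blocks,
        "samples": 16 * 16 + 2 * chroma_w * chroma_h,
    }


for fmt in ("4:2:0", "4:2:2", "4:4:4"):
    print(fmt, macroblock_layout(fmt))
```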
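[Figure 11.B17: hierarchy of a video sequence: group of pictures, picture, slice, macroblock, and
block of 8 pixels by 8 pixels. Within a macroblock, the Y blocks are numbered 1 to 4, the Cb
block 5, and the Cr block 6.]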
Intra pictures (I-pictures) are coded using only information present in the picture itself:
. I-pictures provide potential random access points into the compressed video data.
. I-pictures use only transform coding and provide moderate compression.
. I-pictures typically use about 2 bits per coded pixel.
Predicted pictures (P-pictures) are coded with respect to the nearest previous I- or
P-picture (technique is called forward prediction; see Figure 11.B19):
Bidirectional pictures (B-pictures) are pictures that use both a past and future picture as a
reference (technique is called bidirectional prediction; see Figure 11.B19):
. B-pictures provide the most compression and do not propagate errors because they
are never used as a reference.
. Bidirectional prediction also decreases the effect of noise by averaging two
pictures.
[Figure 11.B19: forward prediction and bidirectional prediction, each illustrated on the picture
sequence I B B P B B P.]

The MPEG algorithms allow the encoder to choose the frequency and location of
I-pictures; the choice is based on an application's need for random accessibility and
the location of scene cuts in a video sequence (in applications where random access is
important, I-pictures are typically used two times a second). The encoder also chooses
the number of B-pictures between any pair of reference (I- or P-) pictures. The choice is
based on factors such as the amount of memory in the encoder and the characteristics of
the material being coded. A typical arrangement of I-, P-, and B-pictures, in the order in
which they are displayed, is shown in Figure 11.B20.
Two B-pictures between reference (I or P) pictures; one I-picture every 15th frame (1/2 second at 30 Hz)
Picture type: I B B P B B P B B P B B P B B I B B P B B P B B P B B P B B
Display order: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
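The display-order arrangement in the figure can be generated from two parameters. In the sketch below, the names n (distance between I-pictures) and m (distance between reference pictures) are conventional labels introduced here for illustration; they do not appear in the text.

```python
def gop_display_order(n: int = 15, m: int = 3) -> str:
    """Picture types of one group of pictures, in display order.

    n: number of pictures from one I-picture to the next
    m: spacing between reference (I or P) pictures
    """
    types = []
    for i in range(n):
        if i == 0:
            types.append("I")
        elif i % m == 0:
            types.append("P")
        else:
            types.append("B")
    return "".join(types)


gop = gop_display_order()
print(" ".join(gop * 2))  # two consecutive groups, 30 pictures (1 second at 30 Hz)
```

With n = 15 and m = 3, the pattern has one I-picture every 15 frames and two B-pictures between successive reference pictures, matching the arrangement listed above.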
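[Figure: MPEG encoder block diagram: DCT, quantizer (Q), and VLC encoder in the forward path;
frame memory, motion compensation, and motion estimation (producing motion vectors) in the
prediction loop.]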
Error resiliency features added to support recovery in packet loss conditions include:
The MPEG-4 Advanced Simple Profile (ASP) starts from the simple profile and
adds B frames and interlaced tools (for level 4 and up) similar to MPEG-2. It also adds
quarter-pixel motion compensation and an option for global motion compensation.
MPEG-4 advanced simple profile requires significantly more processing performance
than the simple profile and has higher complexity and coding efficiency than MPEG-2.
MPEG-4 was used initially in Internet streaming and was adopted, for example,
by Apple's QuickTime player. MPEG-4 simple profile is now finding usage in mobile
streaming applications.
improvement offered by H.264 creates new market opportunity, such as the following
possibilities:
. VHS-quality video at about 600 kbps. This can enable video delivery on demand
over ADSL lines.
. An HD movie can fit on one ordinary DVD instead of requiring new laser optics.
When H.264 was standardized, it supported three profiles: baseline, main, and
extended. Later, an amendment called Fidelity Range Extension (FRExt) introduced four
additional profiles referred to as the high profiles. Initially, the baseline and main
profiles generated the most interest. The baseline profile requires less computation and
system memory and is optimized for low latency. It includes neither B frames, because of
their inherent latency, nor CABAC, because of its computational complexity. The baseline
profile is a good match for video telephony applications as well as other applications that
require cost-effective real-time encoding.
The main profile provides the highest compression but requires significantly more
processing than the baseline profile, making it difficult to use in low-cost real-time
encoding and low-latency applications. Broadcast and content storage applications
are primarily interested in the main profile to leverage the highest possible video quality
at the lowest bit rate.
While H.264 uses the same general coding techniques as previous standards, it has
many new features that distinguish it from previous standards, which combined improve
coding efficiency. The main differences are:
. Intraprediction and Coding—H.264 uses spatial domain of intraprediction to
predict the pixels in an intra-MB from the neighboring pixels in adjacent blocks.
The prediction residual along with the prediction modes is coded, rather than the
actual pixels in the block. This results in a significant improvement in intracoding
efficiency.
. Interprediction and Coding—Interframe coding in H.264 leverages most of the
key features in earlier standards and adds both flexibility and functionality,
including multiple options for block sizes for motion compensation, quarter-pel
motion compensation, multiple reference frames, generalized bidirectional pre-
diction, and adaptive loop deblocking.
. Variable Vector Block Sizes—Motion compensation can be performed using a
number of different block sizes. Individual motion vectors can be transmitted for
blocks as small as 4 × 4, so up to 32 motion vectors may be transmitted for a single
MB in the case of bidirectional prediction. Block sizes of 16 × 8, 8 × 16, 8 × 8, 8 × 4,
and 4 × 8 are also supported. Smaller block sizes improve the ability to handle fine
motion detail and result in better subjective quality, including the absence of large
blocking artifacts.
. Multiple Reference Frame Prediction—Up to 16 different reference frames can be
used for interpicture coding, resulting in better subjective video quality and more
efficient coding. Providing multiple reference frames can also help make the
H.264 bit stream more error resilient. Note that this feature leads to increased
memory requirement for both the encoder and the decoder since multiple
reference frames must be maintained in memory.
. Adaptive Loop Deblocking Filter—H.264 uses an adaptive deblocking filter that
operates on the horizontal and vertical block edges within the prediction loop to
remove artifacts caused by the block prediction errors. The filtering is generally
based on 4 × 4 block boundaries, in which up to three pixels on either side of the
boundary may be updated using a four-tap filter.
. Integer Transform—Previous standards that use DCT had to define rounding-
error tolerances for fixed-point implementations of the inverse transform. Drifts
caused by mismatches in the IDCT precision between the encoder and decoder
were a source of quality loss. H.264 gets around the problem by using an integer
4 × 4 spatial transform, which is an approximation of the DCT. The small 4 × 4
shape also helps reduce blocking and ringing artifacts.
. Quantization and Transform Coefficient Scanning—Transform coefficients are
quantized using scalar quantization with no widened dead zone. Different quanti-
zation step sizes can be chosen for each MB, similar to prior standards, but the step
sizes are increased at a compounding rate of approximately 12.5%, rather than by a
constant increment. Also, finer quantization step sizes are used for the chrominance
component, especially when the luminance coefficients are coarsely quantized.
. Entropy Coding—Unlike previous standards that offered a number of static VLC
tables depending on the type of data under consideration, H.264 uses a context-
adaptive VLC for the transform coefficients and a single universal VLC approach for
all the other symbols. The main profile also supports a new Context-Adaptive Binary
Arithmetic Coder (CABAC). The CAVLC is superior to previous VLC implementa-
tions but without the full cost of CABAC.
. CABAC—It uses a probability model to encode and decode the syntax elements,
such as transform coefficients and motion vectors. To increase the coding
efficiency of arithmetic coding, the underlying probability model is adapted to
the changing statistics within a video frame through a process called context
modeling. Context modeling provides estimates of conditional probabilities of the
coding symbols. Utilizing suitable context models, the given intersymbol redun-
dancy can be exploited by switching between different probability models,
according to already coded symbols in the neighborhood of the current symbol.
Each syntax element maintains a different model (e.g., motion vectors and
transform coefficients have different models). CABAC can provide up to about
10% bitrate improvement over CAVLC.
. Weighted Prediction—It forms the prediction for bidirectionally interpolated
macroblocks by using the weighted sum of forward and backward predictions,
which leads to higher coding efficiency when scenes fade in or out.
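Two of the figures in the list above can be reproduced with simple arithmetic: the maximum number of motion vectors per macroblock for a given block size, and the effect of a step size that grows by about 12.5% per increment. The sketch below is illustrative; the function names are hypothetical, and it assumes that every partition of the 16 × 16 macroblock carries one motion vector per prediction direction.

```python
def max_motion_vectors(block_w: int, block_h: int, bidirectional: bool = False) -> int:
    """Motion vectors needed if a 16 x 16 MB is fully split into equal blocks."""
    partitions = (16 // block_w) * (16 // block_h)
    return partitions * (2 if bidirectional else 1)


def step_growth(increments: int, rate: float = 0.125) -> float:
    """Relative quantization step size after compounding increases."""
    return (1 + rate) ** increments


print(max_motion_vectors(4, 4, bidirectional=True))  # smallest blocks, two directions
print(max_motion_vectors(16, 8))                     # two 16 x 8 partitions
print(round(step_growth(6), 3))                      # growth after six increments
```

Sixteen 4 × 4 blocks with two vectors each give the 32 motion vectors mentioned above, and six compounded increases of 12.5% roughly double the quantization step size.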
stereo-view video was introduced. The FRExt amendment introduced four new profiles to
H.264:
. High Profile (HP) for standard 4:2:0 chroma sampling with 8-bit color per
component.
. New tools were introduced for this profile; described in more detail below.
. High 10 Profile (Hi10P) for 10-bit color with standard 4:2:0 chroma sampling for
higher fidelity video displays.
. High 4:2:2 10-bit color profile (H422P) useful for source editing functions such as
alpha blending.
. High 4:4:4 12-bit color profile (H444P) for the highest quality source editing and
color fidelity supporting lossless coding for regions of the video and a new integer
color space transform (from RGB to YUV and back).
Among the new profiles, H.264 HP, which maintains 8-bit components and 4:2:0
chroma sampling, appears especially promising to the broadcast and DVD community.
Some experiments show as much as 3x gain for H.264 HP over MPEG-2. Below are the
key additional tools introduced in H.264 HP:
Another interesting feature is the ability to use explicit fading compensation for
scenes involving fading. This improves the quality of motion compensation in these
scenarios. WMV9/VC-1 achieves significant performance improvements over MPEG-2
and MPEG-4 simple profile and has fared well in some perceptual quality rating
comparisons with H.264. WMV9 is also standardized as a compression option for the
upcoming HD-DVD and Blu-ray formats.
11.B.3.3 AVS
In 2002, the Audio-Video Standard Working Group established by the Ministry of
Information Industry of China announced an effort to create a national standard for
mobile multimedia, broadcast, DVD, etc. The video standard, referred to as AVS,
consists of two related parts, AVS-M for mobile video applications and AVS1.0 for
broadcast and DVD. The AVS standards are similar to H.264. AVS1.0 supports both
interlaced and progressive modes. AVS allows the use of two previous reference frames
for P frames while allowing one future and one previous frame for B frames. In the
interlaced mode, up to four fields are allowed for reference. Frame/field coding in the
interlaced mode can be performed at a frame-level only, unlike H.264, where MB-level
adaptation of this option is allowed. AVS has a loop filter similar to H.264, which can be
disabled at a frame level. Also, no loop filter is required in B-pictures. Intraprediction
is done on 8 × 8 blocks. MC allows up to 1/4 pel for luma blocks. The block sizes for
ME can be 16 × 16, 16 × 8, 8 × 16, or 8 × 8. The transform is a 16-bit based 8 × 8 integer
transform, similar to WMV9. VLC is based on context-adaptive 2-D run/level coding.
Four different Exp-Golomb codes are used. The code used for each quantized
coefficient is adaptive to the previous symbols within the same 8 × 8 block. Since
Exp-Golomb tables are parametric, table sizes are small. The visual quality of AVS 1.0
for progressive video sequences is marginally inferior to H.264 main profile at the same
bit rate. AVS-M is targeted especially at mobile video applications and overlaps with
H.264 baseline profile. It only supports progressive video, I and P frames, and no
B frames. The main AVS-M coding tools include 4 × 4 block-based intraprediction,
quarter-pel motion compensation, integer transform and quantization, context-adaptive
VLC, and a highly simplified loop filter. Similar to H.264 baseline profile,
in AVS-M the motion vector block size can be down to 4 × 4, and consequently an MB
can have up to 16 motion vectors. Multiple frame prediction is used, but it only requires
up to two reference frames. A subset of H.264 HRD/SEI messages is also defined in
AVS-M. On average and with similar settings, the coding efficiency of AVS-M is about
0.3 dB worse than that of the H.264 baseline profile, whereas decoder complexity is about
20% lower.
The material in this section is based directly on a paper by Golston and Rao
[GOL200601].
+-+-+-+-+------+------------+---+--+--+---------+
|T|V|A|O| O | | O |S |O | |
|e|i|u|t| t | | t |I |t | |
|l|d|d|h| h | IP | h | |h | Other |
|e|e|i|e| e | | e |T |e |protocols|
|t|o|o|r| r | | r |a |r | native |
|e| | | | | | |b | | over |
|x| | | | | +---+----+-+ |l | |MPEG-2 TS|
|t| | | | | | | MPE | |e | | |
| | | | | +--+---+ +------+ | | | |
| | | | | | AAL5 |ULE|Priv. | | | | |
+-+-+-+-+---+------+ | +-+--+--+ |
| PES | ATM | |Sect. |Section| |
+-------+----------+---+------+-------+---------+
| MPEG-2 TS |
+---------+-------+----------------+------------+
|Satellite| Cable | Terrestrial TV | Other PHY |
+---------+-------+----------------+------------+
[Figure: MPE section header fields, including table_id, section_syntax_indicator, section_length,
Mac_address_6, Mac_address_5, payload_scrambling_control, address_scrambling_control,
LLC_SNAP_flag, current_next_indicator, Mac_address_4, Mac_address_3, and Mac_address_2.]
appropriate to both MPE and any alternative encapsulation method developed (MPE will
continue to be deployed in the future to develop new markets; any alternative encapsu-
lation would need to coexist with MPE). The developed protocols may also be applicable
to other multicast-enabled subnetwork technologies supporting large numbers of directly
connected systems.
An encapsulation method that has emerged recently is the Unidirectional Lightweight
Encapsulation (ULE) defined in RFC 4326. ULE is layered directly on the TS. This
approach is known as data piping; it transports IPv4 and IPv6 datagrams directly over
the ISO MPEG-2 TS as TS private data.
ULE also supports DVB architecture, the ATSC system, and other similar MPEG-2-based
transmission systems. ULE adds little encapsulation overhead: the ULE header is much
smaller and less complex than the MPE header
[HON200501].
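The small size of the ULE header can be seen in a short framing sketch. The layout used below follows RFC 4326 as understood here and should be treated as an assumption rather than a reference implementation: a 1-bit D field and 15-bit Length, a 16-bit Type, an optional 6-byte destination address (present when D = 0), the PDU, and a trailing CRC-32 computed with the MPEG-2 polynomial. The function names are hypothetical.

```python
import struct


def crc32_mpeg2(data: bytes) -> int:
    """Bitwise CRC-32 (poly 0x04C11DB7, initial value 0xFFFFFFFF, no reflection)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ 0x04C11DB7) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


def build_ule_sndu(pdu: bytes, type_field: int = 0x0800, dest_npa: bytes = b"") -> bytes:
    """Wrap one PDU in a ULE SNDU (sketch; default Type 0x0800 is IPv4)."""
    if dest_npa and len(dest_npa) != 6:
        raise ValueError("destination address must be 6 bytes when present")
    d_bit = 0 if dest_npa else 1               # D = 1: no destination address
    length = len(dest_npa) + len(pdu) + 4      # bytes after Type, including CRC
    if length > 0x7FFF:
        raise ValueError("PDU too large for the 15-bit Length field")
    header = struct.pack(">HH", (d_bit << 15) | length, type_field)
    body = header + dest_npa + pdu
    return body + struct.pack(">I", crc32_mpeg2(body))


sndu = build_ule_sndu(bytes(40))               # 40-byte stand-in for an IP packet
print(len(sndu) - 40)                          # encapsulation overhead in bytes
```

Without a destination address the overhead is 8 bytes (4-byte header plus 4-byte CRC); the resulting SNDU would then be fragmented into TS packet payloads on the chosen PID.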
REFERENCES
[ATS200001] A/90, ATSC Data Broadcast Standard, Advanced Television Systems Committee
(ATSC), Doc. A/090, 2000.
[BRD200701] Broadband Services Forum, IPTV Explained—Part 1 in a BSF Series, Fremont, CA,
www.broadbandserivcesforum.org.
[BRO200701] Broadcast Engineering Basics, http://www.interactivetvweb.org/tutorial/dtv-intro/
dtv-transmission.shtml.
[RFC4259] RFC 4259, A Framework for Transmission of IP Datagrams over MPEG-2 Networks,
M-J. Montpetit, G. Fairhurst, et al., November 2005.
[SIP200701] Spirent Communications, High Quality Mobile TV: The Challenge for Operators to
Deliver High-Quality TV to Mobiles, White Paper, Spirent, Eatontown, NJ, February 2007.
12
DVB-H: HIGH-QUALITY TV
TO CELL PHONES
1
At press time an initiative by the newly formed Open Mobile Video Coalition (OMVC) sought to develop a
different technology intended to be used by U.S. over-the-air TV broadcasters to deliver video to handheld
devices (cell phones) without using cellular technology. However, DVB-H has been around for several years and
is already a standard. The OMVC work targeted a standard by February 17, 2009, when analog TV broadcasting
ceases in the United States.
The cell phone/PDA screen is considered a “third” content delivery interface point by
many (the first is the TV screen, and the second is the PC screen). There is keen industry
interest in delivering video to cell phones as a way to support an unmet need and also
increase Average Revenue per User2 (ARPU). Figure 12.1 depicts one industry forecast for
mDTV services in the United States.
mDTV is of interest to the entire wireless market—operators, handset OEMs,
infrastructure, and semiconductor providers, but for the DTV market in the United States
to take off, open standards must see implementation [TEX200701]. Initial mDTV
services use streaming video over the cellular network; the downside of this approach
is that it uses voice bandwidth and in so doing lowers the overall capacity of the network
for all users. The commercial breakthrough will arise from live broadcast TV. By design,
mDTV will offer high-quality live broadcast TV (20–30 frames per second; QCIF-QVGA
format) accompanied by full audio. Additional services will be available as a
menu/guide system and pay-per-view channels to enhance the viewing experience
[PIE200501]. Open standards offer advantages over proprietary technologies and
networks controlled by a single company; consumers benefit from the innovation and
less expensive devices. Portable handsets with relatively large, high-resolution Liquid
Crystal Display (LCD) screens, powerful CPUs, and long battery life provide users
viewing enjoyment along with freedom of movement. Compression/decompression
standards with reduced bandwidth requirements and acceptable quality video signals
have emerged in the past 15 years as discussed in the previous chapter. Also, fueling
expansion is the development of cellular networks from second generation (2G)
through third generation (3G) and soon to fourth generation (4G) [SIP200701].
[Figure: mobile video multicasting revenues ($ Mn, axis from 0 to 1200) by year, 2004–2010]
Note: All figures are rounded; the base year is 2005. Source: Frost & Sullivan
Figure 12.1. Mobile Video Server Market: Revenues from Different Types of Mobile Video
Servers (U.S.) 2004–2010
2
Some also favor the expansion average rate per unit.
BACKGROUND AND MOTIVATION 305
. DMB (Digital Multimedia Broadcasting) is deployed today in Korea, with several
handsets already on the market that support the standard, and is expanding to Europe
and other parts of Asia.
. ISDB-T (Integrated Services Digital Broadcast-Terrestrial) is the standard in Japan.
. DVB-H is quickly gaining ground with trials in Europe, the United States, and
parts of Asia.
. 3G networks: While the economics and bandwidth requirements of streaming live
broadcasts over the cellular network are still being assessed at this time, the use of
3G networks to download clips or full television shows to PDA/cell-phone
memory is practical.3
. OMVC — see Appendix 12.A.
As implied above, there are other “nonopen” technologies that have been developed for
mDTV, including MediaFLO. Table 12.1 provides a comparison [TEX200701]. This
chapter focuses on DVB-H. While there are a number of frequency plans to support DVB-H,
especially internationally, in the United States the 700–800-MHz band is now being
considered; this relieves (at least initially) the 3G bandwidth-limitation issues just noted for
video delivery to handhelds. DVB-H is an extension of the DVB-T standard, with additional
features to support handheld and mobile reception. Lower power
consumption for mobile terminals and secured reception in mobile environments
are key features of the standard. It is meant for IP-based services. DVB-H can share the
DVB-T MUX with MPEG-2/MPEG-4 services, so it can be part of the IPTV infrastructure
described in the previous chapter, except that lower bit rates are used for transmission
(typically in the 384-kbps range). The content aggregation point is similar to that described
in Chapter 11, including the use of CASs. Since the middle of this decade, a number of
network operators, equipment providers, and content providers have conducted or are
conducting DVB-H trials around the world.
DVB-H was published as ETSI standard EN 302 304 in November 2004. This
standard is an umbrella standard defining how to combine the existing (now updated)
ETSI standards to form the DVB-H system. The basic standards in DVB-H are as follows:
. ETSI EN 302 304: “Digital Video Broadcasting (DVB); Transmission System for
Handheld Terminals (DVB-H)”
3
One of the issues with 3G is bandwidth. One could recall from the recent discussions regarding Apple's
release of the iPhone that the handheld (iPhone) got much better reviews than the (specific) carrier that one had
to subscribe to in order to use that device: the carrier did not have its 3G (high-speed) data network deployed
throughout the United States, which means that some of the key features (such as Web browsing) do not run as
fast as consumers may have hoped.
. Draft ETSI TR 102 377 V1.1.1 (2005-01): “Digital Video Broadcasting (DVB);
DVB-H Implementation Guidelines”
. Draft ETSI TR 159 r12: “Digital Video Broadcasting (DVB); DVB-H Implemen-
tation Guidelines”
Mobile DTV has the potential to positively impact the bottom line, ARPU, of dis-
tributors, multimedia content suppliers, and telcos. Specifically, there are a number of
business opportunities for mobile DTV, including the following [PIE200501]:
Carriers (telcos)
. DVB-H allows them to increase ARPU with a new service to existing customers
. Reduce churn rate
. Attract new customers with competitive services and channel offerings through
mobile DTV
. Gain additional revenue from interactive services and advertising
. Make deals with content providers or aggregators on their own to deliver content
to their subscribers.
Content providers/broadcasters
. Gain revenue with phone upgrades as mobile DTV services increase in popularity
. Develop new mobile phone designs that are small but deliver the performance and
screen resolution for crisp, clear images
Some, however, including this author, have a concern that in an attempt to add a
screen to a cell phone handset, the speaker portion of the handset is being severely
compromised in size. In some cell phone models, where the ear would naturally
line up when the cell phone is in use, there is instead a 2-in. × 2-in. screen with a trivial
speaker element placed literally on the rim (rather than center) of the upper shell
portion of the phone. This arrangement runs quite contrary to the physiognomy of
the ear, which would prefer the speaker closer to the center of the upper portion of
the phone. This design has the effect of making the voice quality really marginal,
especially for use in noisy environments (e.g., in a car) or for older users.
Manufacturers need to ascertain that the voice quality remains at an acceptable
level as they contemplate the video use of the cell phone. The potential use of a
Bluetooth earpiece does not invalidate this design concern, because the earpiece is
not always in use or is pragmatically inappropriate.
Silicon vendors: Develop chip sets and software for mobile phones.
Software third parties: Deliver additional software and applications for mobile
phones.
Figures 12.2–12.4 provide basic graphical views of DVB-H/mobile DTV environ-
ments. Notice the CAS. The backbones make use of IP multicast. A DVB-H network is
typically a combination of [PIE200501]:
Delivery (transmission) mechanisms for cell phone reception fall into two categories
[SIP200701]:
DVB-H, DMB, and MediaFLO need to deal with bandwidth constraints because
while each video channel may be watched by many subscribers simultaneously, the
bandwidth available per channel is limited. These bandwidth constraints are mitigated
through the use of state-of-the-art compression standards that provide good-enough
quality for a handheld at 384 kbps. Because of the constrained data rates suggested for
individual DVB-H services and the small displays of typical handheld terminals, the
classical audio and video coding schemes used in digital broadcasting do not suit
DVB-H well; therefore, the DVB-H standard replaces MPEG-2 video with H.264/AVC
or other high-efficiency video coding standards [FAR200601]. As noted in the previous
[Figure residue: DVB-H network block diagram. Recoverable labels: Content 1 through Content n; Firewall; Encryptor; Conditional-Access System; Control Word Generator; MPEG-2 TV Services; MUX; DVB-H IPE (IP, MPE, MPE-FEC, Time Slicing); DVB-H signalling; 8k or 2k; HPA; Channel; rec; Dense or Sparse-Dense]
CDMA2000
(Mbit/s)               1xEVDO   Rev. A   Rev. B   Rev. C
Down link data rate    2.4      3.1      3.1-73   70-200
Up link data rate      0.153    1.8      1.8-27   30-45

(Mbit/s)               WCDMA    HSDPA    HSUPA    HSPA+
Down link data rate    0.384    1.8-72   7.2      40
Up link data rate      0.384    0.384    5.8      10
Note: Total cell (sector) bandwidth is comparatively small, typically 2–3 Mbps at the low end and 20 Mbps at the
higher end. This puts an upper limit on the total number of simultaneous subscribers who can watch video
(perhaps an absolute maximum of 100 or so). WiMAX is better for total system throughput (approximately
40–80 Mbps) but may reach a bottleneck when subscribers reach a few hundred.
Courtesy: Spirent Communications.
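The capacity limit in the note above can be checked with simple arithmetic. The sketch below is illustrative only: it assumes every viewer receives a separate unicast stream at the 384-kbps rate quoted in this chapter and ignores protocol overhead and other traffic, so the results are optimistic upper bounds. The function name is hypothetical.

```python
def max_unicast_viewers(cell_capacity_mbps: float, stream_kbps: float = 384.0) -> int:
    """Upper bound on simultaneous unicast video streams in one cell (sector)."""
    return int(cell_capacity_mbps * 1000 // stream_kbps)


if __name__ == "__main__":
    for label, mbps in [("low-end cell", 2), ("high-end cell", 20), ("WiMAX, low estimate", 40)]:
        print(f"{label} ({mbps} Mbps): at most {max_unicast_viewers(mbps)} viewers")
```

A broadcast bearer such as DVB-H avoids this per-viewer scaling because one transmission of a channel serves every receiver in the coverage area.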
BASIC DVB-H TECHNOLOGY 311
DVB-H is seen as a “proven technology” because it is based on the DVB standard used in
Europe for terrestrial and satellite DTV transmission but has a low-power mode for
battery-powered devices. A DVB-H system is a combination of elements of the physical
and link layers, as well as service information. At the physical layer, it uses an OFDM air
interface technology and includes a technique for power reduction in the tuner. OFDM is
a good choice for mobile TV air interface because it offers good spectral efficiency,
immunity to multipath interference, and good mobile performance and works well in
single-frequency networks such as those planned for mobile TV. DVB-H uses time
slicing so that the tuner can be switched off most of the time and is only on during short
transmission bursts. This allows the tuner to operate over a reduced input bandwidth and
also conserves power. In the United States, DVB-H will be deployed using clear and
“ready-for-use” spectrum available today, without interfering with the existing analog
TV stations or other TV or wireless services [TEX200701].
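The power benefit of time slicing follows from the tuner's duty cycle. The sketch below is a simplified model, not the formula from the DVB-H implementation guidelines: it assumes the receiver is on only for the burst plus a fixed resynchronization time, and the burst size, burst bit rate, and synchronization time are illustrative assumptions rather than values from the text.

```python
def time_slicing_power_saving(
    burst_size_bits: float,
    burst_rate_bps: float,
    service_rate_bps: float,
    sync_time_s: float = 0.25,
) -> float:
    """Fraction of front-end on-time avoided, relative to continuous reception."""
    burst_duration = burst_size_bits / burst_rate_bps  # tuner receives the burst
    cycle_time = burst_size_bits / service_rate_bps    # time until the next burst is needed
    on_time = burst_duration + sync_time_s             # wake up early to resynchronize
    return 1.0 - on_time / cycle_time


if __name__ == "__main__":
    # Assumed example: 2-Mbit bursts sent at 10 Mbps for a 384-kbps service.
    saving = time_slicing_power_saving(2e6, 10e6, 384e3)
    print(f"Estimated front-end saving: {saving:.1%}")
```

With these assumed numbers the tuner is on for 0.45 s out of every 5.2 s or so, which gives a saving of roughly 91%.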
The Digital Video Broadcast (DVB) Project started research work related to mobile
reception of DVB—Terrestrial (DVB-T) signals in 1998, accompanying the introduction
of commercial terrestrial digital TV services in Europe. In 2000, the EU-sponsored
Motivate (Mobile Television and Innovative Receivers) Project concluded that mobile
reception of DVB-T is possible, but it implies dedicated broadcast networks. It was
recognized that mobile services are more demanding in robustness (i.e., constellation and
coding rate) than broadcast networks planned for fixed DVB-T reception. Later, in 2002,
the EU-sponsored Multimedia Car Platform (MCP) Project explored antenna diversity
reception. By adding spatial diversity to the frequency and time diversities provided
by the DVB-T transmission layer, this technique improved reception performance enough
to allow a mobile receiver to access DVB-T signals broadcast for fixed receivers. While
DVB-T shows sufficient flexibility to permit mobile broadcast services, it is not ideally
suited for these applications. As a consequence, in early 2002, the DVB community was
asked to provide technical specifications to allow delivery of rich multimedia content to
handheld terminals, a capability missing from the original DVB-T. This would make it
possible to receive TV-type services in a small, handheld device like a mobile phone
[FAR200601].
Handheld terminals (defined as a lightweight, battery-powered apparatus) require
specific features from the transmission system serving them, as defined in ETSI TR 102
377 V1.2.1 (2005-11):
. The transmission system must offer the possibility to repeatedly turn the power off
to some parts of the reception chain. This reduces the average power consumption
of the receiver.
. The transmission system must ensure that it is easy for receivers to move from one
transmission cell to another while maintaining the DVB-H service.
. For a number of reception scenarios (indoor, outdoor, pedestrian, and inside a
moving vehicle), the transmission system must offer sufficient flexibility and
scalability to allow the reception of DVB-H services at various speeds while
optimizing transmitter coverage.
. Since services are expected to be delivered in environments that suffer high levels
of human-made electromagnetic noise, the transmission system needs to offer the
means to mitigate its effects on the performance of the receiving terminal.
. Since DVB-H aims to provide a generic way to serve handheld terminals in various
parts of the world, the transmission system must offer the flexibility to be used in
various transmission bands and channel bandwidths.
As noted, the DVB-H system is defined based on the existing DVB-T standard for
fixed and in-car reception of digital TV; the main additional elements in the link layer
(i.e., the layer above the physical layer) are time slicing and additional FEC coding.
Figure 12.5 depicts the DVB-H protocol stack.
DVB-H makes use of the following technological elements for the link and physical
layers:
. Link layer:
– Time slicing is used in order to reduce the average power consumption of the
receiving terminal and enable smooth and seamless frequency handover when
the user leaves one service area in order to enter a new cell. Time slicing reduces
the average power consumption of the receiver front end by up to about 90–95%. Time slicing is
mandatory for DVB-H. With DVB-H, a device has a need to receive audio/video
[Figure residue: MPE-FEC frame structure. The application data table (191 columns; IP datagrams plus padding) is carried in MPE sections, and the RS data table (64 columns; RS parity bytes) is carried in MPE-FEC sections, over the MPEG-2 TS.]
[Figure residue: DVB-H system. Transmit side: MPEG-2 TV services and the DVB-H IP encapsulator (MPE, MPE-FEC, time slicing) feed a MUX whose TS drives a DVB-T modulator (8k, 4k, 2k; DVB-H TPS) and RF transmission. Receive side: RF reception, DVB-T demodulator (8k, 4k, 2k; DVB-H TPS), DVB-H IP deencapsulator (time slicing, MPE-FEC, MPE), and display. Other labels: Encoder, CAS, L2S.]
. Physical layer: DVB-H makes use of DVB-T but with the following technical
elements specifically targeting DVB-H use:
– DVB-H signaling in the Transmission Parameter Signaling (TPS) bits to
enhance and speed up service discovery. A cell identifier is also carried in
the TPS-bits to support quicker signal scan and frequency handover on mobile
receivers. DVB-H signaling is mandatory for DVB-H.
– 4K mode for trading off mobility and SFN cell size, allowing single-antenna
reception in medium SFNs at very high speed, adding flexibility for the network
design. 4K mode is not mandatory for DVB-H.
– In-depth symbol interleaver for the 2K and 4K modes to further improve the
robustness in mobile environments and impulse noise conditions. In-depth
symbol interleavers for 2K and 4K are not mandatory for DVB-H.
Hence, the physical layer has four extensions to the existing DVB-T physical layer
[FAR200601]:
1. The bits in TPS have been upgraded to include two additional bits to indicate
presence of DVB-H services and possible use of MPE-FEC to enhance and speed
up the service discovery.
2. A new 4K OFDM mode is adopted for trading off mobility and SFN cell size,
allowing single-antenna reception in medium SFNs at very high speeds. 4K
mode is an option for DVB-H complementing the 2K and 8K modes that are
available as well. The objective of the 4K mode is to improve network planning
flexibility by trading off mobility and SFN size.
3. A new way of using the symbol interleaver of DVB-T has been defined. For the 2K
and 4K modes, the operator may select (instead of the native interleaver, which
interleaves the bits over one OFDM symbol) an in-depth interleaver that
interleaves the bits over four or two OFDM symbols, respectively. This
approach brings the basic tolerance to impulse noise of these modes up to the
level attainable with the 8K mode and also improves robustness in mobile
environments.
4. A 5-MHz channel bandwidth is available for use in nonbroadcast bands. This is of
interest, for example, in the United States, where a network at about 1.7 GHz is
running DVB-H in a 5-MHz channel.
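The claim in item 3, that in-depth interleaving lifts the 2K and 4K modes to the 8K level, can be illustrated by counting how many payload carriers one interleaving block spans. The per-symbol payload carrier counts below (1512, 3024, 6048) are the usual DVB-T figures and are an assumption here, since the text quotes only total carrier counts; the function name is hypothetical.

```python
# Payload (data) carriers per OFDM symbol; assumed DVB-T values, not from the text.
DATA_CARRIERS = {"2K": 1512, "4K": 3024, "8K": 6048}

# OFDM symbols spanned by the interleaver: in-depth for 2K and 4K, native for 8K.
INTERLEAVER_SYMBOLS = {"2K": 4, "4K": 2, "8K": 1}


def interleaver_span(mode: str) -> int:
    """Number of payload carriers covered by one interleaving block."""
    return DATA_CARRIERS[mode] * INTERLEAVER_SYMBOLS[mode]


if __name__ == "__main__":
    for mode in ("2K", "4K", "8K"):
        print(mode, interleaver_span(mode))
```

All three modes end up interleaving over the same number of payload carriers, which is why an impulse-noise burst is spread equally thinly in each.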
DVB-T has only 2K (1705 carriers) and 8K (6817 carriers) modes. DVB-H adds a
4K (3409 carriers) mode, which allows the designer to trade off mobility and SFN cell
size (see Figure 12.7). DVB-H also adds enhanced in-depth interleavers for the 2K and
4K modes; these distribute burst errors over a longer time span so that FEC is able to
correct them. DVB-H is backward compatible with DVB-T.
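The mobility versus SFN-size trade-off can be made concrete with the OFDM symbol timing of each mode. The sketch below assumes an 8-MHz channel (elementary period of 7/64 microseconds) and a guard interval of 1/4, neither of which is stated in the text. Wider carrier spacing tolerates more Doppler shift (mobility), while a longer guard interval tolerates longer echo paths (larger SFN cells).

```python
SPEED_OF_LIGHT = 299_792_458.0            # m/s
ELEMENTARY_PERIOD_8MHZ = 7 / 64 * 1e-6    # s; assumed 8-MHz channel raster
FFT_SIZE = {"2K": 2048, "4K": 4096, "8K": 8192}


def ofdm_mode(mode: str, guard_fraction: float = 1 / 4) -> dict:
    """Timing figures for one OFDM mode under the assumptions above."""
    useful_symbol = FFT_SIZE[mode] * ELEMENTARY_PERIOD_8MHZ  # seconds
    guard_interval = useful_symbol * guard_fraction           # seconds
    return {
        "useful_symbol_us": useful_symbol * 1e6,
        "carrier_spacing_hz": 1.0 / useful_symbol,
        "guard_interval_us": guard_interval * 1e6,
        # Longest echo path difference absorbed by the guard interval.
        "max_echo_path_km": guard_interval * SPEED_OF_LIGHT / 1000.0,
    }


if __name__ == "__main__":
    for mode in ("2K", "4K", "8K"):
        m = ofdm_mode(mode)
        print(f"{mode}: spacing {m['carrier_spacing_hz']:.0f} Hz, "
              f"guard {m['guard_interval_us']:.0f} us, "
              f"echo path up to {m['max_echo_path_km']:.1f} km")
```

Under these assumptions the 4K mode sits midway: it has twice the carrier spacing of 8K and twice the guard interval of 2K.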
[Figure residue: OFDM carriers n, n+1, and n+2 along the frequency axis f; each carrier is modulated with QPSK, 16 QAM, or 64 QAM]
receiver to decode the wanted service and shut off during the other service bits. It
aims to reduce receiver power consumption while also enabling a smooth and
seamless frequency handover. The MPE-FEC module provided by DVB-H offers,
in addition to the error correction in the physical layer transmission, a comple-
mentary FEC function that allows the receiver to cope with particularly difficult
reception situations.
. The DVB-H terminal itself. The handheld terminal decodes/uses IP services only.
Note that the 4K mode and the in-depth interleavers are not available, for
compatibility reasons, in cases where the multiplex is shared between services
intended for fixed DVB-T receivers and services for DVB-H devices.
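The complementary MPE-FEC function mentioned above can be sized from the frame dimensions given in this section: 191 columns of application data and 64 columns of RS parity, which corresponds to a Reed-Solomon RS(255,191) code applied along each row. The sketch below is illustrative; the 1024-row frame height is an assumption not stated in the text, and the function name is hypothetical.

```python
APP_COLUMNS = 191  # application data table (IP datagrams), from the text
RS_COLUMNS = 64    # RS data table (parity bytes), from the text


def mpe_fec_frame(rows: int = 1024) -> dict:
    """Capacity and redundancy of one MPE-FEC frame with the given row count."""
    total_columns = APP_COLUMNS + RS_COLUMNS
    return {
        "app_bytes": APP_COLUMNS * rows,
        "parity_bytes": RS_COLUMNS * rows,
        "code_rate": APP_COLUMNS / total_columns,
        # An RS code with 64 parity bytes can rebuild up to 64 erased bytes per row.
        "max_erasures_per_row": RS_COLUMNS,
    }


if __name__ == "__main__":
    frame = mpe_fec_frame()
    print(f"{frame['app_bytes']} data bytes, {frame['parity_bytes']} parity bytes, "
          f"code rate {frame['code_rate']:.3f}")
```

Roughly a quarter of the transmitted frame is parity, which is the price paid for coping with the difficult reception conditions noted above.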
[Figure residue; recoverable labels: Power control, Time slicing, IP Datagrams]
At press time, a coalition of almost 800 local stations in the United States was
working to bring live over-the-air broadcast TV to mobile television devices by 2008.
The broadcasting groups were working with the Digital TV (DTV) standard body
ATSC to create a new standard to allow broadcasters to transmit live video and
non-real-time data services to mobile phones and other handheld devices via their
existing digital TV spectrum without interfering with their current HD or SD
programming. The Open Mobile Video Coalition (OMVC) has established the goal
to have the technology standard in place by February 17, 2009, when analog TV
broadcasts will cease in the United States [DIC200701]. The OMVC, which has
enlisted financial support from the National Association of Broadcasters and technical
help from the Association for Maximum Service Television (MSTV), represents 422
commercial stations in 142 markets covering 103 million U.S. TV households as well
as 361 public TV stations. Members include station groups ION, Belo, Fox, Gannett,
Gray, NBC/Telemundo, Sinclair, Tribune, Cox, Dispatch, Freedom, LIN, Meredith,
Media General, Post-Newsweek, Raycom, Schurz, and the Association of Public
Television Stations.
The technical challenge is not trivial. It requires the transmission of robust signals
to small, portable devices within the existing 6-MHz DTV channel, without interfering
with the core programming services stations are already providing with their 19.4 Mbps
of digital throughput. The request for proposals that the ATSC circulated in mid-2007
asked for a system that could not only deliver live, advertiser-supported TV to
cellphones but also support subscription services, non-real-time download services
for on-demand playback, datacasting applications, interactive TV, and real-time
navigation data for automobiles. Preliminary proposals were submitted in 2007 by
10 companies and/or groups of companies, including Coding Technologies, Coherent
Logix, DTS, LG Electronics and Harris Corp., Mobile DTV Alliance, Micronas
Semiconductor, Nokia, Samsung Electronics Co., and Rohde & Schwarz, Thomson,
and Qualcomm. Some proposals related to full mobile DTV systems, such as MPH
(Mobile Pedestrian Handheld) from LG/Harris and A-VSB (Advanced-Vestigial Side
Band) from Samsung/Rohde & Schwarz, both of which were demonstrated at the NAB
show in Las Vegas in April 2007. There were two proposals being considered at press
time. (1) The MPH system, which LG and Harris formally documented in an 80-page
submission to the ATSC late in 2007, has been undergoing continual development since
it was first unveiled at the NAB show. LG has created an MPH receiver chip that will
allow it to soon demonstrate much smaller form-factor mobile devices than the “big
box” it used to demonstrate MPH in Las Vegas; such devices would have a single
antenna less than 3 inches long. (2) As an alternative proposal, Samsung had been refining its
A-VSB system since first demonstrating it in a shuttle bus at the Consumer Electronics
Show in early 2007. The goal of OMVC was to reach a compromise, inclusive solution
and then proceed to implementation, in competition with DVB-H-based solutions
[DIC200701].
REFERENCES
[DIC200701] G. Dickson, Mobile TV Takes Flight, Broadcasting & Cable, November 12, 2007.
[FAR200601] G. Faria, J. A. Henriksson, E. Stare, P. Talmola, DVB-H: Digital Broadcast Services
to Handheld Devices, Proceedings of the IEEE, vol. 94, no. 1, January 2006, page 194.
[PIE200501] R. Pieck, DVB-H Broadcast to Mobile Devices, White Paper, Newtec America, Inc.,
Stamford, CT, www.newtecamerica.com, September 14, 2005.
[SIP200701] Spirent Communications, High Quality Mobile TV: The Challenge for Operators to
Deliver High-Quality TV to Mobiles, White Paper, Spirent, Eatontown, NJ, February 2007.
[TEX200701] Texas Instruments, DVB-H Mobile Digital TV for the U.S., White Paper, Texas
Instruments, Dallas, TX.
GLOSSARY
A la carte VoD VoD where one pays for each item viewed (similar to pay
per view). Typically the subscriber has one day to view the
content [NOR200601].
Adaptation Field An optional variable-length extension field of the fixed-
length TS packet header, intended to convey clock
references and timing and synchronization informa-
tion as well as stuffing over an MPEG-2 multiplex
[CLA200301].
ADSL (Full-Rate Asymmetric DSL) Access technology that offers differing upload and
download speeds and can be configured to deliver up to six megabits of data per second
(6000 kbps) from the network to the customer. ADSL enables voice and high-speed data
to be sent simultaneously over the existing telephone line. This type of DSL is the most
predominant in commercial use for business and residential customers around the
world. Good for general Internet access and for applications where downstream speed
is most important, such as video on demand. ITU-T recommendation G.992.1 and ANSI
standard T1.413-1998 specify full-rate ADSL. ITU recommendation G.992.3 specifies
ADSL2, which provides advanced diagnostics, power saving functions, PSD shaping, and
better performance than G.992.1. ITU recommendation G.992.5 specifies ADSL2Plus,
which provides the benefits of ADSL2 plus twice the bandwidth so that bit rates as high
as 20 Mbps downstream can be achieved on relatively short lines [DSL200701].
AES (Advanced Encryption Standard) Cryptographic algorithm; NIST-approved standard.
The current AES is Rijndael. It was chosen by NIST because it is considered to be both
faster and smaller than its competitors. See also DES and 3DES [CON200701].
AFC Adaptation Field Control.
Asymmetric Algorithm Same as public key algorithm.
Asymmetric Encryption Type of encryption in which encryption keys are different
from decryption keys, and one key is computationally
difficult to determine from the other. Uses an asymmetric
algorithm [CON200701].
ATSC (Advanced Television Systems Committee) A set of framework and associated
standards for the transmission of video, audio, and data, using the ISO MPEG-2
standard [CLA200301].
Authentication The process of proving the genuineness of an entity (such as
a smart card) by means of a cryptographic procedure. Put
simply, authentication amounts to using a fixed procedure
Incoming Interface (iif) In PIM SM, the iif of a multicast route entry indicates the
interface from which multicast data packets are accepted for forwarding. The iif is
initialized when the entry is created [RFC2362].
Internet Group Management Protocol (IGMP) The protocol used by IP Version 4 (IPv4)
hosts to communicate multicast group membership states to multicast routers. IGMP is
used to dynamically register individual hosts/receivers on a particular local subnet to
a multicast group. IGMPv1 defined the basic mechanism. It supports a Membership Query
(MQ) message and Membership Report (MR) message. Most implementations at press time
employed IGMPv2; Version 2 adds Leave Group (LG) messages. Version 3 adds source
awareness allowing the inclusion or exclusion of sources. IGMP allows group membership
lists to be dynamically maintained. The host (user) sends an IGMP “report,” or join, to
the router to be included in the group. Periodically, the router sends a “query” to learn
which hosts (users) are still part of a group. If a host wishes to continue its group
membership, it responds to the query with a report. If the host does not send a report,
the router prunes the group list to delete this host; this eliminates unnecessary network
transmissions. With IGMPv2, a host may send a Leave Group message to alert the router
that it is no longer participating in a multicast group; this allows the router to prune
the group list to delete this host before the next query is scheduled, thereby minimizing
the time period during which unneeded transmissions are forwarded to the network.
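On most hosts an application does not build IGMP messages itself; it asks the IP stack to join or leave a group, and the stack emits the corresponding report or Leave Group message. The sketch below shows this with the standard socket options; the helper names are hypothetical and any group address passed in is an arbitrary example, not one taken from the text.

```python
import socket
import struct


def build_mreq(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack the ip_mreq structure: group address, then local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def join_group(sock: socket.socket, group: str, interface: str = "0.0.0.0") -> None:
    """Ask the IP stack to join the group; the stack sends the IGMP report."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_mreq(group, interface))


def leave_group(sock: socket.socket, group: str, interface: str = "0.0.0.0") -> None:
    """Ask the IP stack to leave the group; with IGMPv2 this triggers a Leave Group."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                    build_mreq(group, interface))
```

A receiver would typically create a UDP socket, bind it to the stream's port, call `join_group`, read datagrams, and call `leave_group` when the viewer changes channel.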
IPTV (IP-Based TV) Approaches, technologies, or protocols to deliver commer-
cial-grade Standard-Definition (SD) and High-Definition
(HD) entertainment-quality real-time linear and on-
demand video content over IP-based networks, while
meeting all prerequisite quality of service, quality of
experience, conditional access (security), blackout
management (for sporting events), emergency alert system,
closed captions, parental controls, Nielsen rating collection,
secondary audio channel, picture-in-picture, and guide data
requirements of the content providers and/or regulatory
entities. Typically, IPTV makes use of Moving Pictures
Expert Group 4 (MPEG-4) encoding to deliver 200–300 SD
channels and 20–40 HD channels; viewers need to be able to
switch channels within 2 s or less; also, the need exists to
support multiset-top boxes/multiprogramming (say 2–4)
addresses are not forwarded by a router; they remain local on a
particular LAN segment [they have a Time-to-Live (TTL)
parameter set to 1; even if the TTL is different from 1, they still
are not forwarded by the router].
MAC Medium Access and Control of the Ethernet IEEE 802
standard of protocols [CLA200301].
MAC Header The link-layer header of the IEEE 802.3 standard or
Ethernet v2. It consists of a 6B destination address, 6B
source address, and 2B type field (see also NPA)
[FAI200501].
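The 14-byte layout in the MAC Header entry (6-byte destination, 6-byte source, 2-byte type) can be unpacked directly. This is a minimal sketch; the function name is hypothetical and the sample frame used to exercise it is made up for illustration.

```python
import struct

MAC_HEADER_LEN = 14  # 6B destination + 6B source + 2B type


def parse_mac_header(frame: bytes) -> tuple:
    """Return (destination, source, ether_type) from the start of a frame."""
    if len(frame) < MAC_HEADER_LEN:
        raise ValueError("frame shorter than a MAC header")
    dst, src, ether_type = struct.unpack("!6s6sH", frame[:MAC_HEADER_LEN])

    def fmt(address: bytes) -> str:
        return ":".join(f"{octet:02x}" for octet in address)

    return fmt(dst), fmt(src), ether_type


if __name__ == "__main__":
    sample = bytes.fromhex("01005e010101" "001122334455" "0800") + b"payload"
    print(parse_mac_header(sample))
```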
Member In PIM SM it is the host that is to receive multicast
transmissions. The protocol documentation also refers to
a member as a “receiver” [ROD200701].
MPE (Multiprotocol Encapsulation) A scheme that encapsulates PDUs, forming a DSM-CC
table section. Each section is sent in a series of TS packets using a single TS logical
channel [FAI200501].
MPEG-2 (Motion Picture Experts Group–2) A set of multiplexing/encoding standards
specified by the Motion Picture Experts Group (MPEG) and standardized by the
International Organization for Standardization (ISO/IEC 13818-1) and ITU-T (H.222.0).
Both MPEG-2 and MPEG-4 are important for IPTV, but the recent trend is in favor of
MPEG-4.
MPEG-7 An ISO/IEC standard for description and search of audio
and visual content.
Multicast Address An identifier for a group of nodes. An IP multicast address
or group address, as defined in “Host Extensions for IP
Multicasting,” STD 5, RFC 1112, August 1989, and in “IP
Version 6 Addressing Architecture,” RFC 2373, July
1998. The Internet Assigned Numbers Authority (IANA)
controls the assignment of IP multicast addresses.
IANA has allocated what has been known as the
Class D address space to be utilized for IP multicast. IP
multicast group addresses are in the range 224.0.0.0–
239.255.255.255.
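The Class D range quoted in the Multicast Address entry is the prefix 224.0.0.0/4, so membership can be tested with the standard `ipaddress` module. This is a small illustrative sketch; the function name is hypothetical.

```python
import ipaddress

# 224.0.0.0 through 239.255.255.255, the former Class D space.
CLASS_D = ipaddress.ip_network("224.0.0.0/4")


def is_ipv4_multicast(address: str) -> bool:
    """True if the dotted-quad address lies in the IPv4 multicast range."""
    return ipaddress.IPv4Address(address) in CLASS_D


if __name__ == "__main__":
    for candidate in ("224.0.0.1", "239.255.255.255", "240.0.0.0", "192.168.1.1"):
        print(candidate, is_ipv4_multicast(candidate))
```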
Multicast Address Dynamic Client Allocation Protocol (MADCAP) A protocol defined in
RFC 2730 that allows hosts to request multicast addresses from multicast address
allocation servers. This protocol is part of the IETF multicast address allocation
architecture.
Multicast Address Set Claim Protocol (MASC) Protocol defined in RFC 2909 that can be
used for interdomain multicast address set allocation. MASC is used by a node (typically
a router) to claim and allocate one or more address prefixes to that node's domain.
While a domain does not necessarily need to allocate an address
PPT (Pay Per Time) The consumer pays for consuming a media file once within
a time limit.
PPV (Pay Per View) Services offered in a way that the consumer will pay for the
service on a PPV basis. PPV services can be offered either
as prebooked (P-PPV) or impulse (I-PPV).
Pragmatic General Multicast (PGM) A reliable multicast transport protocol for
applications that require ordered, duplicate-free multicast data delivery. The protocol
guarantees that a receiver in a multicast group receives all data packets from direct
transmissions or via retransmissions of lost packets. PGM can detect unrecoverable
data packet loss.
Premium VoD VoD where one pays an additional monthly fee for premium
content such as recent movies.
Private Key Decryption key is often called private key in public-key
systems. A private key is also used for signing a message.
Private Section A syntactic structure used for mapping all service informa-
tion (e.g., an SI table) into TS packets. A table may be
divided into a number of sections. All sections of a table must
be carried over a single TS (Transport Stream) logical
channel [CLA200301]. A structure constructed in accor-
dance with Table 2-30 of ISO-MPEG-2. The structure may
be used to identify private information (i.e., not defined by
ISO-MPEG-2) relating to one or more elementary streams,
or a specific MPEG-2 program, or the entire TS. Other
standards bodies, for example, ETSI and ATSC, have defined
sets of table structures using the private-section structure. A
private section is transmitted as a sequence of TS packets
using a TS logical channel. A TS logical channel may carry
sections from more than one set of tables [FAI200501].
Product Metadata Metadata related to a media file, including product ID,
category, protecting services, access modes, usage rights,
pricing info, scheduling info, maturity rating, and addressing
[CON200701].
Program Television program or multimedia streams.
Program Stream A PES packet multiplex that carries several elementary
streams that were encoded using the same master clock or
system time clock.
Protocol Independent Multicast (PIM) A protocol that provides intradomain
multicast forwarding for all underlying unicast routing protocols [e.g., Open
Shortest Path First (OSPF) or Border Gateway Protocol
(BGP)], independent from the intrinsic unicast protocol.
Two modes exist: PIM Sparse Mode (PIM SM) and PIM
Dense Mode (PIM DM).
Prune List In PIM SM, the prune list is the second list of addresses that
is included in a Join/Prune message. It indicates those
sources or RPs from which downstream receiver(s) wish
to prune [RFC2362].
PSI (Program-Specific Information) PSI is used to convey information about
services carried in a TS multiplex. It is carried in one of four specifically
identified table section constructs; see also SI Table [FAI200501].
Public Key The encryption key is often called the public key in public-key
systems. A public key can also be used for verification of
signatures [CON200701].
Public-Key Algorithm An algorithm where the key used for encryption is different
from the key used for decryption. Furthermore, the private
(decryption) key cannot be calculated from the public
(encryption) key [CON200701].
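The asymmetry described in the Public-Key Algorithm entry can be illustrated with a toy RSA key pair. The following is a minimal sketch and is not taken from the source: RSA is used here only as a familiar example of a public-key algorithm, the primes are deliberately tiny textbook values, and the names make_keypair, encrypt, and decrypt are hypothetical. With primes this small the private key is trivial to recover; the infeasibility mentioned in the definition holds only for large keys.

```python
from math import gcd


def make_keypair(p: int, q: int, e: int):
    """Build a toy RSA key pair from two primes and a public exponent."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to (p - 1) * (q - 1)")
    # Modular inverse of e; the three-argument pow with -1 needs Python 3.8+.
    d = pow(e, -1, phi)
    return (n, e), (n, d)  # (public key, private key)


def encrypt(public_key, message: int) -> int:
    """Encrypt an integer message smaller than n with the public key."""
    n, e = public_key
    return pow(message, e, n)


def decrypt(private_key, ciphertext: int) -> int:
    """Recover the message with the private key."""
    n, d = private_key
    return pow(ciphertext, d, n)


# Illustrative values only; real keys use primes hundreds of digits long.
public_key, private_key = make_keypair(61, 53, 17)
ciphertext = encrypt(public_key, 65)
recovered = decrypt(private_key, ciphertext)
print(recovered == 65)
```

Anyone holding the public key can produce the ciphertext, but only the holder of the private exponent can reverse it.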
Pull VoD VoD system that stores content within the operator network.
Upon request, the content is streamed to the subscriber. The
advantage is that the user can select from a large, centrally
stored content library. The disadvantage is that bandwidth
must be allocated to each subscriber viewing VoD content
[NOR200601].
Push VOD (aka virtual VOD) A system where movies are broadcast in
encrypted format and stored directly on hard disks in the
STBs. A consumer can later purchase access to the movies
[CON200701].
Push VoD VoD system that automatically downloads the VoD content to
the subscriber's DVR. This download is done during off-peak
times or at low priority, eliminating the need for additional
bandwidth. The downside is that this makes some of the DVR
disk unavailable to the subscriber. As such, this approach is
practical only for the latest content, which will be viewed by a
relatively large number of subscribers [NOR200601].
PUSI Payload_Unit_Start_Indicator of MPEG-2. A PUSI value
of zero indicates that the TS packet does not carry the start
of a new payload. A PUSI value of one indicates that the TS
packet does carry the start of a new payload [CLA200301].
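To make the PUSI entry concrete, the sketch below reads the flag from the 4-byte header of a 188-byte MPEG-2 TS packet, where it occupies the 0x40 bit of the second header byte. This is an illustrative sketch only: the function names parse_ts_header and make_ts_packet are hypothetical, and the sample packets are synthetic rather than taken from a real stream.

```python
TS_PACKET_SIZE = 188  # fixed MPEG-2 TS packet length in bytes
SYNC_BYTE = 0x47      # first byte of every TS packet


def parse_ts_header(packet: bytes) -> dict:
    """Extract the PUSI flag and the 13-bit PID from a TS packet header."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a TS packet must be exactly 188 bytes")
    if packet[0] != SYNC_BYTE:
        raise ValueError("missing 0x47 sync byte")
    pusi = bool(packet[1] & 0x40)                  # payload_unit_start_indicator
    pid = ((packet[1] & 0x1F) << 8) | packet[2]    # 13-bit packet identifier
    return {"pusi": pusi, "pid": pid}


def make_ts_packet(pid: int, pusi: bool) -> bytes:
    """Build a synthetic TS packet (hypothetical helper for this example)."""
    byte1 = (0x40 if pusi else 0x00) | ((pid >> 8) & 0x1F)
    byte2 = pid & 0xFF
    byte3 = 0x10  # payload only, no adaptation field, continuity counter 0
    header = bytes([SYNC_BYTE, byte1, byte2, byte3])
    return header + bytes([0xFF]) * (TS_PACKET_SIZE - len(header))


start = parse_ts_header(make_ts_packet(0x100, pusi=True))
continuation = parse_ts_header(make_ts_packet(0x100, pusi=False))
print(start["pusi"], continuation["pusi"])
```

A receiver reassembling sections or PES packets would typically use this flag to decide where a new payload unit begins within a given PID.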
PVR (Personal Video Recorder) DVR and PDR are used interchangeably with this term.
QAM (Quadrature Amplitude Modulation) Modulation technique for cable broadcasting.
QPSK (Quaternary Phase Shift Keying) Modulation technique for satellite broadcasting.
Querier (also known as IGMP querier) The sender of a query
message—the querier is a multicast router. A multicast
REFERENCES
[CIS200701] Cisco Systems, Internet Protocol (IP) Multicast Technology Overview, Cisco
Systems, San Jose, CA.
[CON200701] Conax AS, Glossary of Terms, Oslo, Norway.
[DSL200701] DSL Forum, Fremont, CA, http://www.dslforum.org.
[DVB200701] DVB Organization, Standards, http://www.dvb.org.
[FAI200101] G. Fairhurst, MPEG-2 Digital Video, Background to Digital Video, University
of Aberdeen, Kings College, Dept. of Engineering, Aberdeen, UK, January 2001,
http://www.erg.abdn.ac.uk/research/future-net/digital-video/mpeg2-trans.html.
[FAI200501] G. Fairhurst, M-J. Montpetit, Address Resolution for IP Datagrams over MPEG-2
Networks, Internet Draft, draft-ietf-ipdvb-ar-00.txt, June 2005.
[NOR200601] Nortel, Position Paper: Introduction to IPTV, Triangle Park, NC, 2006.
[RFC1075] RFC 1075, Distance Vector Multicast Routing Protocol, D. Waitzman, C. Partridge,
S. Deering, November 1988.
[RFC1584] RFC 1584, Multicast Extensions to OSPF, J. Moy, March 1994.
[RFC2189] RFC 2189, Core-Based Trees (CBT Version 2) Multicast Routing—Protocol
Specification, A. Ballardie, September 1997.
[RFC2201] RFC 2201, Core-Based Trees (CBT) Multicast Routing Architecture, A. Ballardie,
September 1997.
[RFC2362] RFC 2362, Protocol Independent Multicast Sparse-Mode (PIM-SM): Protocol
Specification, D. Estrin, D. Farinacci, et al., June 1998.
[RFC2730] RFC 2730, Multicast Address Dynamic Client Allocation Protocol (MADCAP),
S. Hanna, B. Patel, M. Shah, December 1999.
[RFC2909] RFC 2909, The Multicast Address-Set Claim (MASC) Protocol, P. Radoslavov,
D. Estrin, et al., September 2000.
[RFC3810] RFC 3810, Multicast Listener Discovery Version 2 (MLDv2) for IPv6, R. Vida,
L. Costa, Editors, June 2004.
[RFC3973] RFC 3973, Protocol Independent Multicast Dense Mode (PIM DM): Protocol
Specification (Revised), A. Adams, A. Nicholas, W. Siadak, January 2005.
[RFC4541] RFC 4541, Considerations for Internet Group Management Protocol (IGMP) and
Multicast Listener Discovery (MLD) Snooping Switches, M. Christensen, K. Kimball,
F. Solensky, May 2006 (status: informational).
[RFC4604] RFC 4604, Using Internet Group Management Protocol Version 3 (IGMPv3) and
Multicast Listener Discovery Protocol Version 2 (MLDv2) for Source-Specific Multicast,
H. Holbrook, B. Cain, B. Haberman, August 2006.
[ROD200701] M. Rodbell, Protocol Independent Multicast Sparse Mode, CMP COMMs
Design, an EE Times Community, June 3, 2007,
http://www.commsdesign.com/main/9811/9811standards.htm.
[WEL200101] P. J. Welcher, The Protocols of IP Multicast, White Paper, Chesapeake
NetCraftsmen, Arnold, MD.
INDEX
SAR 60
Satellite TV 250
Satellite-Based Single-Source IPTV System 241
Scalability 126
Scope in IPv6 203
SD 2, 234, 271
SDI 13, 244
SDI Application 270
SDI Rates 270
SDSL 15
Second-Generation VDSL (VDSL2) 14
Segmentation and Reassembly (SAR) 60
Segmentation and Reassembly over “Existing” MPEG-2 Infrastructure 246
Sending Asserts 104
Sending Candidate-RP-Advertisements 106
Sending ECHO_Requests 136
Sending Hello 133
Sending Hello Messages 159
Sending Hellos 94
Sending JOIN_ACKs 135
Sending Join/Prune Messages 94
Sending Join_Requests 134
Sending Quit_Notifications 135
Sending Registers and Receiving Register-Stops 100
Sequence of Frames 290
Serial Digital Interface (SDI) 13, 244, 269
Service Key 256
Service Management System (SMS) 253
Set-Top Box (STB) 18, 241, 257
SFN 309
S,G 42, 80, 88
(S,G) Assert Message State Machine 177
S,G State 157
Shared Distribution Tree 45
Shared Trees 42, 43, 45
SHDSL 15
Shortest Path Tree (SPT) 42
Show ip igmp groups 76
Show ip mroute 76
Simplified Protocol Hierarchy 271
SimulCrypt 255, 256
Single-Frequency Network (SFN) 309
Single-Program Transport Stream (SPTS) 272, 278
Slice 287
Slice Resynchronization 293
SM 6, 8, 78
SM Protocols 7, 8, 78
SMPTE VC-1 247
SMS 253
SNAP 298
Snooping of IGMP queries 49
Source Addresses 58
Source-Based Delivery Tree Optimization 84
Source Filtering 232
Source Registration 84
Source Trees 42
Sparse Mode (PIM SM) 8, 78
Sparse Mode (SM) 6, 8, 78
Specific MQs 61
SPT 42
SSM 10
Standard Definition (SD) 2, 234, 271
State Change Report 229
State Diagram for a Router in Nonquerier State 69, 70
“Stateful” Configuration 206
“Stateless” Autoconfiguration 206
State-Machine-Specific Timers 162
State Refresh 170
States for Origination (S,G) State Machine 174
State Transition Diagram for a Router in Nonquerier State 227
State Transition Diagram for a Router in Querier State 226
State Transitions 224
Statically Assigned Link-Local Scope 28
STB 18, 241, 257
Steady State Maintenance of Distribution Tree 89
SubNetwork Attachment Point (SNAP) 298
Subscription VoD 239
Switches Using IGMP Snooping 62
Symmetric DSL (SDSL) 15

TCP 6
Telco TV 235
Telcos 243
Timers Related to Tree Maintenance 108
Timers Relating to Neighbor Discovery 111