Multicast Traffic in Vxlan Using Underlay: Example
eos.arista.com/eos-4-20-5f/multicast-in-vxlan-using-underlay
By Saravanan Balasubramanian
Contents
Example
Source Route (Multicast Host Route)
RPF Lookup
MLAG Example
Platform compatibility
Configuration
Troubleshooting
Multi-Agent Protocol Model
Ribd Protocol Model
Limitations
General
Unicast Host Route
Resources
Example
The rest of the document will use the following topology and configuration
to explain how the feature works. In the example network, the source is at
VTEP A and the receivers are at VTEP B. Notice one receiver is in the same
subnet as the source, while the other receiver is in a different subnet.
1. VTEP B receives an IGMP V2 join on VLAN 20 and VLAN 30. The IGMP
reports remain local. Note: Prior to 4.20.5F, the IGMP reports
would have been Vxlan encapsulated and sent on the overlay to VTEP A.
2. If VTEP B has PIM enabled and an RP configured, VTEP B sends a (*,G)
join on the underlay towards the RP. In the case of an IGMP V3 source
join, a PIM S,G join is sent towards the source only after VTEP B learns
the unicast route to the source.
1. VTEP A creates an S,G route with the incoming interface (IIF) as VLAN
10.
2. VTEP A advertises the source route (a /32 host route) into the underlay so
that every router in the underlay is aware of the source.
3. The traffic is sent on the underlay without Vxlan encapsulation. At this
point, PIM operates in its usual way. PIM registers with the RP. The RP
forwards the multicast data down the RP tree towards VTEP B. The RP
learns the source route and switches to the SPT.
4. On VTEP B, the S,G route switches to SPT when it learns the source
route.
Source Route (Multicast Host Route)
Multicast traffic sent by the source does not trigger ARP resolution, so PIM
internally forces an ARP resolution for the source. This helps PIM
determine whether the source is on the local VTEP or a remote VTEP. The source
route is injected only by the local VTEP. After the source route is injected, it
has to be redistributed. Currently, only BGP supports redistribute
attached-host for the URIB and the MRIB. A source route is withdrawn
when the S,G route is deleted or one of the above conditions fails. If several
S,G routes exist for the source, all of them have to age out or be
deleted before the source route is withdrawn.
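The withdrawal rule can be pictured as reference counting per source. The following Python sketch is only an illustration of that rule, not EOS code: the class name `SourceRouteTracker` and its methods are hypothetical, and the addresses are taken from the example outputs in this document.

```python
from collections import defaultdict


class SourceRouteTracker:
    """Hypothetical model: a /32 source route stays advertised while any (S,G) uses it."""

    def __init__(self):
        # source address -> set of groups that currently have an (S,G) route
        self._groups_by_source = defaultdict(set)

    def add_sg(self, source, group):
        """Record an (S,G) route; return True if the source route is newly injected."""
        newly_injected = source not in self._groups_by_source
        self._groups_by_source[source].add(group)
        return newly_injected

    def remove_sg(self, source, group):
        """Drop an (S,G) route; return True if the source route must be withdrawn."""
        groups = self._groups_by_source.get(source)
        if groups is None:
            return False
        groups.discard(group)
        if groups:
            return False  # other (S,G) routes still reference this source
        del self._groups_by_source[source]
        return True

    def advertised(self):
        """Return the host routes that would currently be advertised."""
        return sorted(f"{source}/32" for source in self._groups_by_source)
```

With two groups for source 10.1.1.2, removing the first S,G route leaves 10.1.1.2/32 advertised, and only removing the second one withdraws it.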
RPF Lookup
When PIM does an RPF lookup, it first looks up the MRIB. If no route is
found, PIM falls back on the URIB. In the case of a first hop router where
the source is directly connected, the MRIB will not have any source routes.
On VTEP A, the RPF will be Vlan 10, where the multicast data traffic was
seen. In the multi-agent protocol model, all the source routes are injected
into the MRIB, and therefore advertised and stored in the peer’s MRIB. On
all the PIM routers in the underlay, after the BGP update containing the
source advertisement is received, PIM will find the source route in the
MRIB. This allows PIM to send joins to the correct VTEP. On the last hop
router (VTEP B), after the BGP update is received, the MRIB will have the
source route while the URIB will have the directly connected route. Since
PIM first looks up the MRIB, the last hop router will know that the source is
not local. In the ribd protocol model, all the source routes are injected into
the URIB. BGP advertises the source routes and stores them in the peer’s
URIB.
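The lookup order described above (MRIB first, URIB only as a fallback) can be sketched in Python. This is a simplified model under stated assumptions, not the PIM implementation: the function names `longest_match` and `rpf_lookup` are hypothetical, each table is reduced to a dictionary of prefix to next hop, and the connected 10.1.1.0/24 entry is an assumed URIB route for the example topology.

```python
import ipaddress


def longest_match(table, address):
    """Return (network, next_hop) for the longest prefix covering address, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, next_hop in table.items():
        network = ipaddress.ip_network(prefix)
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop)
    return best


def rpf_lookup(mrib, urib, source):
    """Look up the MRIB first; fall back on the URIB only if the MRIB has no route."""
    hit = longest_match(mrib, source)
    if hit is not None:
        return ("M", str(hit[0]), hit[1])
    hit = longest_match(urib, source)
    if hit is not None:
        return ("U", str(hit[0]), hit[1])
    return None


# Last hop (VTEP B) after the BGP update: host route in the MRIB,
# directly connected subnet in the URIB. The MRIB route wins.
urib = {"10.1.1.0/24": "connected"}
print(rpf_lookup({"10.1.1.2/32": "3.0.0.1"}, urib, "10.1.1.2"))
# ('M', '10.1.1.2/32', '3.0.0.1')

# First hop (VTEP A): no source route in the MRIB, so the URIB is used.
print(rpf_lookup({}, urib, "10.1.1.2"))
# ('U', '10.1.1.0/24', 'connected')
```

The sketch shows why the last hop router concludes that the source is not local: the MRIB is consulted before the directly connected route in the URIB is ever considered.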
MLAG Example
In an MLAG scenario, PIM agents run on both MLAG peers and work as
independent routers. PIM Hellos resolve DR-ship, while PIM asserts
maintain correct forwarding state. On a Vxlan VLAN, PIM Hellos are bridged
between peers locally but are not sent on the overlay. Each SVI on each
peer of an MLAG needs to have a unique address so neighborship can be
established. Multicast does not work with ip address virtual because the
virtual address performs source NAT on all packets originating with the
virtual address. This causes PIM and IGMP control packets to be dropped
by the kernel. Instead, use ip virtual-router address. Because the PIM
Hellos are not sent on the overlay, the set of addresses used for an SVI can
be repeated in each VTEP. In the example below, 10.1.1.1 and 10.1.1.3
configured on VTEP A are reused on the same subnet for VTEP B.
Platform compatibility
DCS-7050QX, DCS-7050SX, DCS-7050TX support in 4.20.5.1F
DCS-7060CX, DCS-7060CX2, DCS-7060SX2 support in 4.20.5.1F
DCS-7260CX, DCS-7260CX3, DCS-7260QX support in 4.20.5.1F
DCS-7500 and DCS-7280 series support in 4.20.5F
DCS-7300X series support in 4.20.5.1F
DCS-7320X series support in 4.20.5.1F
Configuration
To inject a source route, configure ip multicast source route export on
the incoming interface.
Arista(config)#interface Vlan10
Arista(config-Vl10)#ip pim sparse-mode
Arista(config-Vl10)#ip multicast source route export
To redistribute the source routes in the MRIB via BGP while running multi-
agent protocol model, configure redistribute attached-host for the IPv4
multicast address-family. Activate the neighbor to establish a BGP
connection.
Arista(config-router-bgp)#address-family ipv4 multicast
Arista(config-router-bgp-af)#neighbor 3.0.0.2 activate
Arista(config-router-bgp-af)#redistribute ?
attached-host Multicast source routes
connected Connected routes
isis IS-IS routes
static Static multicast routes
Arista(config-router-bgp-af)#redistribute attached-host
To redistribute the source routes in the URIB via BGP while running ribd
protocol model, configure redistribute attached-host under router bgp.
Arista(config-router-bgp)#redistribute attached-host
This is a sample configuration for a VTEP for the setup above using multi-
agent protocol model.
interface Loopback0
   ip address 1.1.1.1/32
interface Vxlan1
   vxlan source-interface Loopback0
   vxlan vlan 10 vni 10000
interface Vlan10
   ip address 10.1.1.1/24
   ip pim sparse-mode
   ip multicast source route export
router bgp 10
   router-id 0.0.0.2
This is a sample configuration for a VTEP for the setup above using ribd
protocol model.
service routing protocols model ribd
interface Loopback0
   ip address 1.1.1.1/32
interface Vxlan1
   vxlan source-interface Loopback0
   vxlan vlan 10 vni 10000
interface Vlan10
   ip address 10.1.1.1/24
   ip pim sparse-mode
   ip multicast source route export
router bgp 10
   router-id 0.0.0.2
   redistribute attached-host
Troubleshooting
On the first-hop router, use show ip mroute to verify that the S,G route
has been created. The RPF should not use the source route. Instead, the RPF
should use the directly connected route to the source in the URIB.
As of 4.20.5.1F, an ARP entry should exist for each source. The MAC address
of the source should be learnt on a local port.
Arista#show arp 10.1.1.2
Address         Age (min)  Hardware Addr   Interface
10.1.1.2        N/A        0012.0100.0001  Vlan10, Port-Channel1
On the underlay and last-hop router, verify that the S,G is using the source
route in MRIB with show ip mroute.
Arista#show ip mroute sparse-mode
PIM Sparse Mode Multicast Routing Table
Flags: E - Entry forwarding on the RPT, J - Joining to the SPT
R - RPT bit is set, S - SPT bit is set, L - Source is attached
W - Wildcard entry, X - External component interest
I - SG Include Join alert rcvd, P - (*,G) Programmed in hardware
H - Joining SPT due to policy, D - Joining SPT due to protocol
Z - Entry marked for deletion, C - Learned from a DR via a register
A - Learned via Anycast RP Router, M - Learned via MSDP
N - May notify MSDP, K - Keepalive timer not running
T - Switching Incoming Interface, B - Learned via Border Router
RPF route: U - From unicast routing table
M - From multicast routing table
225.1.1.1
10.1.1.2, 0:00:52, flags: SR
Incoming interface: Vlan10
RPF route: [M] 10.1.1.2/32 [200/0] via 3.0.0.1
Vlan20
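When checking many routes, the RPF line can be inspected programmatically. The Python sketch below is a hedged example that assumes the line format shown in the sample output above ("RPF route: [M] prefix [preference/metric] via next-hop"); the format may differ between releases, and the helper name `rpf_entries` is hypothetical. It reports whether each RPF route came from the MRIB (M) or the URIB (U).

```python
import re

# Assumed format, taken only from the sample output in this document:
#   RPF route: [M] 10.1.1.2/32 [200/0] via 3.0.0.1
RPF_LINE = re.compile(
    r"RPF route:\s+\[(?P<table>[UM])\]\s+(?P<prefix>\S+)"
    r"\s+\[(?P<pref>\d+)/(?P<metric>\d+)\]"
    r"(?:\s+via\s+(?P<via>\S+))?"
)


def rpf_entries(show_output):
    """Return one dict per 'RPF route: [X] ...' line found in the text."""
    return [match.groupdict() for match in RPF_LINE.finditer(show_output)]
```

The legend line "RPF route: U - From unicast routing table" is not matched because it has no bracketed table code. On an underlay or last-hop router, an entry whose table is U for a remote source would suggest that the source route has not reached the MRIB.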
On the underlay and last-hop router, use the show bgp commands to verify
that the route has been received by BGP.
On the underlay and last-hop router, the route should be present in the
MRIB, which can be checked using show ip route multicast.
Arista#show ip bgp
BGP routing table information for VRF default
Router identifier 0.0.0.2, local AS number 10
Route status codes: s - suppressed, * - valid, > - active, # - not installed, E - ECMP head, e - ECMP
                    S - Stale, c - Contributing to ECMP, b - backup, L - labeled-unicast
Origin codes: i - IGP, e - EGP, ? - incomplete
AS Path Attributes: Or-ID - Originator ID, C-LST - Cluster List, LL Nexthop - Link Local Nexthop
      Network        Next Hop   Metric  LocPref  Weight  Path
 * >  3.0.0.0/24     3.0.0.2    0       100      0       20 i
 * >  10.1.1.2/32    -          0       0        -       ?
Limitations
General
1. This feature is only supported in the default VRF on both the underlay
and the overlay.
2. Topologies where PIM routers are connected to the edge of Vxlan
VLANs are not supported. This solution expects hosts to be
connected to the Vxlan VLANs.
3. With the current implementation, the VTEPs cannot be non-Arista
routers. In a future release, a solution for interop will be provided.
4. With the current implementation, the VTEPs have to be layer 3 with
PIM and IGMP running. In a future release, we will provide a solution
where a VTEP can be layer 2 and still manage to get the multicast
traffic through the underlay.
5. In the scenario where the source moves, there is a possibility for
traffic loss. For example, in the above topology, if the source moves
from VTEP A to VTEP B, the S,G route on VTEP A will never age out. On
VTEP B, the S,G route has IIF pointing towards the underlay and any
traffic seen on VLAN 20 would cause PIM to install a fastdrop. The
activity on the fastdrop route on VTEP B will keep the S,G route alive,
and VTEP B will continue to send joins towards the source. Since the
first hop router (VTEP A) is receiving joins, it prevents the S,G route
from aging out. VTEP A continues to advertise the source route. Only
when the source route is withdrawn can VTEP B create a route with
VLAN 20 as the new IIF.
6. In the scenario where the BGP update message for the source route
is processed after the multicast data traffic and the last hop router is
configured with ip multicast source route export, the last hop
router might assume it is the first hop and start advertising the
source route. Since both the first and last hop routers are advertising the
source route, the S,G joins might be sent to the last hop router,
causing traffic loss.
Resources
1. ARP converted Host Routes injection into BGP
2. MP BGP for IPv4 Multicast