VXLAN SVI suspended as soon as VLAN becomes part of vn-segment
While configuring VXLAN with the MP-BGP EVPN control plane, I came across some interesting behavior. As soon as I assigned a VLAN to a VXLAN segment (vn-segment), the SVI for that VLAN was automatically suspended with the following error:
%FWM-3-NVE_LIBINIT: NVE library not initialized by client in nve_vni_export_get_data
N5K2(config-if)# vlan 10
N5K2(config-vlan)# vn-segment 10100
N5K2(config-vlan)# exit
N5K2(config)# 2018 Oct 20 21:40:29 N5K1 %FWM-3-NVE_LIBINIT: NVE library not initialized by client in nve_vni_export_get_data
N5K2(config)# sh int vlan 10
Vlan10 is down (suspended), line protocol is down
Even after following the proper order of operations, the SVI would still become suspended after enabling vn-segment:
- vlan # ; vn-segment #
- interface nve 1 ; member vni # ; mcast-group #
- interface #
So I started troubleshooting from the bottom up. If you are certain that all of the items below are correct, scroll all the way down for the remediation.
- Are all the features enabled on the Spines and Leafs?
- Is the unicast routing working (UNDERLAY) and can you reach your neighbors?
- Is the multicast routing working and can you reach igmp mcast-group members?
- Is the VNI up for the particular mcast-group and are you learning MACs via VTEPs?
- Is the control plane (MP-BGP) working and are the routes being learned?
Are all the features enabled on the Spines and Leafs?
Make sure that all the required features are enabled (use either OSPF or IS-IS for the underlay).
install feature-set fabric
feature-set fabric
feature fabric forwarding
feature interface-vlan
feature ospf
!feature isis
feature nv overlay
feature bgp
feature vn-segment-vlan-based
nv overlay evpn
Is the unicast routing working (UNDERLAY/OSPF) and can you reach your neighbors?
If this is where it fails, verify your underlay routing. The example below is from the perspective of VTEP2/Leaf102 (N5K2, .52) toward VTEP1 (N5K1, .51) and both Spines (.71 and .72).
N5K2(config-if)# sh ip route ospf-UNDERLAY
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>
10.0.0.51/32, ubest/mbest: 2/0
    *via 10.0.0.71, Eth1/5, [110/9], 17:14:52, ospf-UNDERLAY, intra
    *via 10.0.0.72, Eth1/6, [110/9], 17:14:52, ospf-UNDERLAY, intra
10.0.0.71/32, ubest/mbest: 1/0
    *via 10.0.0.71, Eth1/5, [110/5], 21:23:18, ospf-UNDERLAY, intra
    via 10.0.0.71, Eth1/5, [250/0], 21:23:26, am
10.0.0.72/32, ubest/mbest: 1/0
    *via 10.0.0.72, Eth1/6, [110/5], 21:23:20, ospf-UNDERLAY, intra
    via 10.0.0.72, Eth1/6, [250/0], 21:23:26, am
10.0.0.254/32, ubest/mbest: 2/0
    *via 10.0.0.71, Eth1/5, [110/5], 21:23:18, ospf-UNDERLAY, intra
    *via 10.0.0.72, Eth1/6, [110/5], 21:23:18, ospf-UNDERLAY, intra
10.1.2.51/32, ubest/mbest: 2/0
    *via 10.0.0.71, Eth1/5, [110/9], 17:14:52, ospf-UNDERLAY, intra
    *via 10.0.0.72, Eth1/6, [110/9], 17:14:52, ospf-UNDERLAY, intra
10.1.2.71/32, ubest/mbest: 1/0
    *via 10.0.0.71, Eth1/5, [110/5], 17:18:20, ospf-UNDERLAY, intra
10.1.2.72/32, ubest/mbest: 1/0
    *via 10.0.0.72, Eth1/6, [110/5], 17:18:20, ospf-UNDERLAY, intra
N5K2(config-if)# ping 10.0.0.51
PING 10.0.0.51 (10.0.0.51): 56 data bytes
64 bytes from 10.0.0.51: icmp_seq=0 ttl=253 time=1.129 ms
N5K2(config-if)# ping 10.0.0.71
PING 10.0.0.71 (10.0.0.71): 56 data bytes
64 bytes from 10.0.0.71: icmp_seq=0 ttl=254 time=1.076 ms
N5K2(config-if)# ping 10.0.0.72
PING 10.0.0.72 (10.0.0.72): 56 data bytes
64 bytes from 10.0.0.72: icmp_seq=0 ttl=254 time=1.169 ms
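If you collect this output from several leafs, the underlay check can be scripted. Here is a minimal Python sketch, assuming the route table was saved as text with its original line breaks. The function names (unicast_path_counts, missing_routes) and the trimmed SAMPLE are my own illustration, not part of any Cisco tooling.

```python
import re

# Matches prefix header lines such as "10.0.0.51/32, ubest/mbest: 2/0".
PREFIX = re.compile(r"^(\d+\.\d+\.\d+\.\d+/\d+), ubest/mbest: (\d+)/(\d+)")


def unicast_path_counts(text):
    """Map each prefix in the route table to its number of best unicast paths."""
    counts = {}
    for line in text.splitlines():
        m = PREFIX.match(line.strip())
        if m:
            counts[m.group(1)] = int(m.group(2))
    return counts


def missing_routes(text, required):
    """Return the required prefixes that are absent or have no best unicast path."""
    counts = unicast_path_counts(text)
    return [p for p in required if counts.get(p, 0) == 0]


# Trimmed sample for illustration; 10.1.2.51/32 is deliberately left out.
SAMPLE = """\
10.0.0.51/32, ubest/mbest: 2/0
    *via 10.0.0.71, Eth1/5, [110/9], 17:14:52, ospf-UNDERLAY, intra
    *via 10.0.0.72, Eth1/6, [110/9], 17:14:52, ospf-UNDERLAY, intra
10.1.2.71/32, ubest/mbest: 1/0
    *via 10.0.0.71, Eth1/5, [110/5], 17:18:20, ospf-UNDERLAY, intra
"""

if __name__ == "__main__":
    print(missing_routes(SAMPLE, ["10.0.0.51/32", "10.1.2.51/32", "10.1.2.71/32"]))
```

An empty list means every loopback you asked about is reachable through the underlay; anything returned is where to start looking.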
Is the multicast routing working and can you reach igmp mcast-group members?
If this is where it fails, verify that you have an RP defined on the Spine (static, Auto-RP, or anycast) and that your VTEPs have learned the RP. The example below is from the elected RP (N7K2) and VTEP2 (N5K2).
N7K2(config-if-range)# sh run pim
feature pim
ip pim send-rp-announce loopback0 group-list 224.0.0.0/4 bidir
ip pim send-rp-discovery loopback0
ip pim ssm range 232.0.0.0/8
ip pim bfd
N5K2(config-if)# sh ip pim rp
PIM RP Status Information for VRF "default"
BSR disabled
Auto-RP listen-only mode
Auto-RP RPA: 10.0.0.72, uptime: 21:26:47, expires: 00:02:55
BSR RP Candidate policy: None
BSR RP policy: None
Auto-RP Announce policy: None
Auto-RP Discovery policy: None
RP: 10.0.0.72, (1), uptime: 21:26:47
  priority: 0, RP-source: 10.0.0.72 (A), group ranges:
  224.0.0.0/4 (bidir) , expires: 00:02:55 (A)
N5K2(config-if)# sh ip mroute
IP Multicast Routing Table for VRF "default"
(*, 224.0.0.0/4), bidir, uptime: 21:26:50, pim ip
  Incoming interface: Ethernet1/6, RPF nbr: 10.0.0.72
  Outgoing interface list: (count: 1)
    Ethernet1/6, uptime: 21:26:50, pim, (RPF)
(*, 231.1.1.1/32), bidir, uptime: 21:26:46, igmp ip pim
  Incoming interface: Ethernet1/6, RPF nbr: 10.0.0.72
  Outgoing interface list: (count: 2)
    Ethernet1/6, uptime: 21:26:46, pim, (RPF)
    loopback0, uptime: 21:26:46, igmp
(*, 232.0.0.0/8), uptime: 21:26:50, pim ip
  Incoming interface: Null, RPF nbr: 0.0.0.0
  Outgoing interface list: (count: 0)
(*, 239.1.1.0/32), bidir, uptime: 21:26:50, ip pim nve
  Incoming interface: Ethernet1/6, RPF nbr: 10.0.0.72
  Outgoing interface list: (count: 2)
    nve1, uptime: 17:40:46, nve
    Ethernet1/6, uptime: 21:26:50, pim, (RPF)
--- 231.1.1.1 ping multicast statistics ---
5 packets transmitted, 0 packets received, 100% packet loss
N5K2(config-if)# ping multicast 231.1.1.1 interface e1/6
PING 231.1.1.1 (231.1.1.1): 56 data bytes
64 bytes from 10.0.0.51: icmp_seq=0 ttl=253 time=1.108 ms
--- 231.1.1.1 ping multicast statistics ---
5 packets transmitted,
From member 10.0.0.51: 5 packets received, 0.00% packet loss
--- in total, 1 group member responded ---
Is the VNI up for the particular mcast-group and are you learning MACs via VTEPs?
If this is where it fails, you may have a misconfigured nve interface or an incorrect VLAN-to-VNI mapping (vn-segment). The NVE state should be Up for each configured VNI, whether L2VNI or L3VNI. The example below is from VTEP2.
N5K2(config-if)# sh run vlan 10
vlan 10
  vn-segment 10100
N5K2(config-if)# sh run int nve 1
interface nve1
  no shutdown
  source-interface loopback1
  host-reachability protocol bgp
  member vni 5000 associate-vrf
  member vni 10100
    suppress-arp
    mcast-group 239.1.1.0
  member vni 10200
    suppress-arp
    mcast-group 239.1.1.0
N5K2(config-router)# sh nve vni
Codes: CP - Control Plane        DP - Data Plane
       UC - Unconfigured         SA - Suppress ARP
       SU - Suppress Unknown Unicast
Interface VNI      Multicast-group   State Mode Type [BD/VRF]      Flags
--------- -------- ----------------- ----- ---- ------------------ -----
nve1      5000     n/a               Up    CP   L3 [transit-l3-vni]
nve1      10100    239.1.1.0         Up    CP   L2 [10]            SA
nve1      10200    239.1.1.0         Up    CP   L2 [20]            SA
N5K2(config-if)# sh mac address-table vlan 10
Legend:
* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
age - seconds since last seen,+ - primary entry using vPC Peer-Link
VLAN MAC Address Type age Secure NTFY Ports/SWID.SSID.LID
---------+-----------------+--------+---------+------+----+------------------
* 10 001b.2188.8075 dynamic 50 F F Eth1/2
* 10 001b.218d.3d98 dynamic 0 F F nve1/10.1.2.51
* 10 2020.0000.00aa static 0 F F sup-eth2
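To run the VNI state check across many VTEPs, the sh nve vni table can be parsed with a few lines of Python. This is only a sketch: the helper names are mine, and the SAMPLE below deliberately marks VNI 10200 as Down, which is a hypothetical state that does not appear in my lab output.

```python
import re

# Matches data rows of `show nve vni`, for example:
# "nve1 10100 239.1.1.0 Up CP L2 [10] SA"
ROW = re.compile(
    r"^(?P<intf>nve\d+)\s+(?P<vni>\d+)\s+(?P<group>\S+)\s+"
    r"(?P<state>\S+)\s+(?P<mode>\S+)\s+(?P<type>L[23])\s+\[(?P<bd>[^\]]+)\]"
)


def parse_nve_vni(text):
    """Return one dict per VNI row; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            rows.append(m.groupdict())
    return rows


def vnis_not_up(text):
    """Return the VNI numbers whose State column is anything other than 'Up'."""
    return [int(r["vni"]) for r in parse_nve_vni(text) if r["state"] != "Up"]


# Hypothetical sample: VNI 10200 is set to Down purely for illustration.
SAMPLE = """\
Interface VNI      Multicast-group   State Mode Type [BD/VRF]      Flags
--------- -------- ----------------- ----- ---- ------------------ -----
nve1      5000     n/a               Up    CP   L3 [transit-l3-vni]
nve1      10100    239.1.1.0         Up    CP   L2 [10]            SA
nve1      10200    239.1.1.0         Down  CP   L2 [20]            SA
"""

if __name__ == "__main__":
    print(vnis_not_up(SAMPLE))
```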
Is the control plane (MP-BGP) working and are the routes being learned?
If BGP is not up and you are not learning the VTEP addresses (prefixes), you need to verify your BGP EVPN configuration.
N5K2(config-if)# sh bgp l2vpn evpn summary
BGP summary information for VRF default, address family L2VPN EVPN
BGP router identifier 10.0.0.52, local AS number 65535
BGP table version is 168, L2VPN EVPN config peers 2, capable peers 2
7 network entries and 9 paths using 928 bytes of memory
BGP attribute entries [7/1008], BGP AS path entries [0/0]
BGP community entries [0/0], BGP clusterlist entries [2/8]
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.1.2.71       4 65535    1121    1118      168    0    0 18:21:22 2
10.1.2.72       4 65535    1126    1122      168    0    0 18:21:12 2
N5K2(config-if)# sh bgp l2vpn evpn
BGP routing table information for VRF default, address family L2VPN EVPN
BGP table version is 168, local router ID is 10.0.0.52
Status: s-suppressed, x-deleted, S-stale, d-dampened, h-history, *-valid, >-best
Path type: i-internal, e-external, c-confed, l-local, a-aggregate, r-redist, I-injected
Origin codes: i - IGP, e - EGP, ? - incomplete, | - multipath, & - backup
   Network            Next Hop            Metric     LocPrf     Weight Path
Route Distinguisher: 10.0.0.51:32777
* i[2]:[0]:[0]:[48]:[001b.218d.3d98]:[0]:[0.0.0.0]/216
                      10.1.2.51                         100          0 i
*>i                   10.1.2.51                         100          0 i
*>i[2]:[0]:[0]:[48]:[001b.218d.3d98]:[32]:[10.0.1.11]/272
                      10.1.2.51                         100          0 i
* i                   10.1.2.51                         100          0 i
Route Distinguisher: 10.0.0.52:32777 (L2VNI 10100)
*>l[2]:[0]:[0]:[48]:[001b.2188.8075]:[0]:[0.0.0.0]/216
                      10.1.2.52                         100      32768 i
*>i[2]:[0]:[0]:[48]:[001b.218d.3d98]:[0]:[0.0.0.0]/216
                      10.1.2.51                         100          0 i
*>l[2]:[0]:[0]:[48]:[001b.2188.8075]:[32]:[10.0.1.12]/272
                      10.1.2.52                         100      32768 i
*>i[2]:[0]:[0]:[48]:[001b.218d.3d98]:[32]:[10.0.1.11]/272
                      10.1.2.51                         100          0 i
N5K2(config-if)# sh bgp l2vpn evpn 10.0.1.11
Route Distinguisher: 10.0.0.51:32777
BGP routing table entry for [2]:[0]:[0]:[48]:[001b.218d.3d98]:[32]:[10.0.1.11]/272, version 166
Paths: (2 available, best #1)
Flags: (0x000202) on xmit-list, is not in l2rib/evpn, is not in HW, , is locked
  Advertised path-id 1
  Path type: internal, path is valid, is best path
  AS-Path: NONE, path sourced internal to AS
    10.1.2.51 (metric 9) from 10.1.2.71 (10.1.2.71)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10100 5000
      Extcommunity: RT:65535:5000 RT:65535:10100 ENCAP:8 Router MAC:00de.fb12.1a7c
      Originator: 10.0.0.51 Cluster list: 10.1.2.71
  Path type: internal, path is valid, not best reason: Neighbor Address
  AS-Path: NONE, path sourced internal to AS
    10.1.2.51 (metric 9) from 10.1.2.72 (10.1.2.72)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10100 5000
      Extcommunity: RT:65535:5000 RT:65535:10100 ENCAP:8 Router MAC:00de.fb12.1a7c
      Originator: 10.0.0.51 Cluster list: 10.1.2.72
  Path-id 1 not advertised to any peer
Route Distinguisher: 10.0.0.52:32777 (L2VNI 10100)
BGP routing table entry for [2]:[0]:[0]:[48]:[001b.218d.3d98]:[32]:[10.0.1.11]/272, version 165
Paths: (1 available, best #1)
Flags: (0x00021a) on xmit-list, is in l2rib/evpn, is not in HW,
  Advertised path-id 1
  Path type: internal, path is valid, is best path
    Imported from 10.0.0.51:32777:[2]:[0]:[0]:[48]:[001b.218d.3d98]:[32]:[10.0.1.11]/272
  AS-Path: NONE, path sourced internal to AS
    10.1.2.51 (metric 9) from 10.1.2.71 (10.1.2.71)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10100 5000
      Extcommunity: RT:65535:5000 RT:65535:10100 ENCAP:8 Router MAC:00de.fb12.1a7c
      Originator: 10.0.0.51 Cluster list: 10.1.2.71
  Path-id 1 not advertised to any peer
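The quickest signal in the summary output is the State/PfxRcd column: a number means the session is established and shows how many prefixes were received, while a word such as Idle or Active means the session is down. A small Python sketch of that check follows. The function names are hypothetical, and the second neighbor in SAMPLE is set to Idle purely for illustration (both of my spines were actually up).

```python
import re

# One neighbor row of `show bgp l2vpn evpn summary`, for example:
# "10.1.2.71 4 65535 1121 1118 168 0 0 18:21:22 2"
NEIGHBOR = re.compile(
    r"^(?P<neighbor>\d+\.\d+\.\d+\.\d+)\s+(?P<ver>\d+)\s+(?P<asn>\d+)\s+"
    r"(?P<rcvd>\d+)\s+(?P<sent>\d+)\s+(?P<tblver>\d+)\s+(?P<inq>\d+)\s+"
    r"(?P<outq>\d+)\s+(?P<updown>\S+)\s+(?P<state>.+)$"
)


def evpn_neighbors(text):
    """Map neighbor IP to its prefix count (int) or its session state (str)."""
    result = {}
    for line in text.splitlines():
        m = NEIGHBOR.match(line.strip())
        if m:
            state = m.group("state").strip()
            result[m.group("neighbor")] = int(state) if state.isdigit() else state
    return result


def neighbors_down(text):
    """Return neighbors whose State/PfxRcd column is a state name, not a count."""
    return sorted(n for n, s in evpn_neighbors(text).items() if not isinstance(s, int))


# Hypothetical sample: 10.1.2.72 is shown as Idle for illustration only.
SAMPLE = """\
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.1.2.71       4 65535    1121    1118      168    0    0 18:21:22 2
10.1.2.72       4 65535       0       0        0    0    0 18:21:12 Idle
"""

if __name__ == "__main__":
    print(neighbors_down(SAMPLE))
```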
If everything above checks out, consider the following:
- Per Cisco support, you should use separate loopback addresses for the VTEP, the unicast underlay, and BGP; otherwise the configuration is not supported. So try reconfiguring your BGP peering with a different loopback address, even though I haven't seen issues with using, for example, lo0 for everything.
- If the anycast gateway feature is enabled for a specific VNI, then the anycast gateway feature must be enabled on all VTEPs that have that VNI configured. Having the anycast gateway feature configured on only some of the VTEPs enabled for a specific VNI is not supported.
- In plain terms: if you intend to use fabric forwarding anycast gateway for seamless VM mobility across your VTEPs, your SVI needs to be part of it; otherwise, remove that global command. For example, when you vMotion a VM from one VTEP to another, the anycast gateway makes the move painless because the MAC address of the VM's gateway of last resort (its default gateway) does not change, so traffic keeps being forwarded without interruption. This is a good feature.
The reason the SVI was suspended as soon as its VLAN was mapped to a vn-segment was that fabric forwarding anycast gateway was enabled globally while the SVI was not part of it.
N5K2(config-if)# sh run | in anycast
fabric forwarding anycast-gateway-mac 2020.0000.00AA
fabric forwarding mode anycast-gateway
N5K2(config-if)# sh run int vlan 10
interface Vlan10
  no shutdown
  vrf member transit-l3-vni
  ip address 10.0.1.52/24
As soon as I enabled fabric forwarding under the SVI, the suspended state went away.
N5K2(config-if)# sh int vlan 10
Vlan10 is down (suspended), line protocol is down
N5K2(config-if)# int vlan 10
N5K2(config-if)# fabric forwarding mode anycast-gateway
N5K2(config-if)# sh int vlan 10
Vlan10 is up, line protocol is up
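This mismatch is easy to catch before the SVI ever goes down. Below is a rough Python sketch that scans a saved running-config for the pattern that bit me: a global anycast gateway MAC, a VLAN mapped to a vn-segment, and an SVI for that VLAN without fabric forwarding mode anycast-gateway. It assumes sub-commands are indented under their parent line, as in sh run output. The function names and the Vlan20 addressing in SAMPLE are hypothetical.

```python
def parse_sections(config):
    """Group a running-config into {top-level line: [indented child lines]}."""
    sections = {}
    current = None
    for raw in config.splitlines():
        if not raw.strip():
            continue
        if raw[0].isspace():
            # Indented line: belongs to the most recent top-level command.
            if current is not None:
                sections[current].append(raw.strip())
        else:
            current = raw.strip()
            sections.setdefault(current, [])
    return sections


def svis_missing_anycast(config):
    """Return VLAN IDs that have a vn-segment and an SVI lacking anycast mode.

    Only reports anything when the global anycast gateway MAC is configured.
    """
    sections = parse_sections(config)
    has_global = any(
        key.lower().startswith("fabric forwarding anycast-gateway-mac")
        for key in sections
    )
    if not has_global:
        return []
    missing = []
    for header, children in sections.items():
        parts = header.split()
        if len(parts) == 2 and parts[0] == "vlan" and parts[1].isdigit():
            if any(child.startswith("vn-segment ") for child in children):
                svi = sections.get("interface Vlan" + parts[1])
                if svi is not None and "fabric forwarding mode anycast-gateway" not in svi:
                    missing.append(int(parts[1]))
    return sorted(missing)


# Vlan10 mirrors the broken state from this post; Vlan20 is a hypothetical fixed SVI.
SAMPLE = """\
fabric forwarding anycast-gateway-mac 2020.0000.00AA
vlan 10
  vn-segment 10100
vlan 20
  vn-segment 10200
interface Vlan10
  no shutdown
  vrf member transit-l3-vni
  ip address 10.0.1.52/24
interface Vlan20
  no shutdown
  vrf member transit-l3-vni
  ip address 10.0.2.52/24
  fabric forwarding mode anycast-gateway
"""

if __name__ == "__main__":
    print(svis_missing_anycast(SAMPLE))
```

Any VLAN ID it returns is an SVI that either needs fabric forwarding mode anycast-gateway or a fabric without the global anycast command.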
I also noticed that the L3VNI kicks in right away and that it is added as a route distinguisher.
N5K2(config-if)# sh bgp l2vpn evpn
BGP routing table information for VRF default, address family L2VPN EVPN
BGP table version is 170, local router ID is 10.0.0.52
Status: s-suppressed, x-deleted, S-stale, d-dampened, h-history, *-valid, >-best
Path type: i-internal, e-external, c-confed, l-local, a-aggregate, r-redist, I-injected
Origin codes: i - IGP, e - EGP, ? - incomplete, | - multipath, & - backup
Route Distinguisher: 10.0.0.52:3 (L3VNI 5000)
*>i[2]:[0]:[0]:[48]:[001b.218d.3d98]:[32]:[10.0.1.11]/272
                      10.1.2.51                         100          0 i
N5K2(config-if)# sh bgp l2vpn evpn 10.0.1.11
Route Distinguisher: 10.0.0.52:3 (L3VNI 5000)
BGP routing table entry for [2]:[0]:[0]:[48]:[001b.218d.3d98]:[32]:[10.0.1.11]/272, version 164
Paths: (1 available, best #1)
Flags: (0x000202) on xmit-list, is not in l2rib/evpn, is not in HW,
  Advertised path-id 1
  Path type: internal, path is valid, is best path
    Imported from 10.0.0.51:32777:[2]:[0]:[0]:[48]:[001b.218d.3d98]:[32]:[10.0.1.11]/272
  AS-Path: NONE, path sourced internal to AS
    10.1.2.51 (metric 9) from 10.1.2.71 (10.1.2.71)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10100 5000
      Extcommunity: RT:65535:5000 RT:65535:10100 ENCAP:8 Router MAC:00de.fb12.1a7c
      Originator: 10.0.0.51 Cluster list: 10.1.2.71
  Path-id 1 not advertised to any peer
The bottom line: if you want to use anycast gateway on your VTEPs, make sure the SVIs are part of it as well. Otherwise, don't apply that global configuration. Big thanks to MM from the CCIE DC Study Group for helping me nail this down.
References:
- https://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus9000/sw/7-x/vxlan/configuration/guide/b_Cisco_Nexus_9000_Series_NX-OS_VXLAN_Configuration_Guide_7x/b_Cisco_Nexus_9000_Series_NX-OS_VXLAN_Configuration_Guide_7x_chapter_0100.html
- https://www.cisco.com/c/en/us/support/docs/ip/border-gateway-protocol-bgp/200952-Configuration-and-Verification-VXLAN-wit.html
- https://www.cisco.com/c/en/us/td/docs/switches/datacenter/pf/configuration/guide/b-pf-configuration/b-pf-configuration_chapter_01111.html#concept_B54E8C5F82C14C7A9733639B1A560A01