VXLAN SVI suspended as soon as VLAN becomes part of vn-segment

While configuring VXLAN with an MP-BGP EVPN control plane I came across some interesting behavior. When I assigned a VLAN to a VXLAN segment (vn-segment), the SVI for that VLAN was automatically suspended with the following error:

%FWM-3-NVE_LIBINIT: NVE library not initialized by client in nve_vni_export_get_data

N5K2(config-if)# vlan 10

N5K2(config-vlan)# vn-segment 10100

N5K2(config-vlan)# exit

N5K2(config)# 2018 Oct 20 21:40:29 N5K1 %FWM-3-NVE_LIBINIT: NVE library not initialized by client in nve_vni_export_get_data

N5K2(config)# sh int vlan 10

Vlan10 is down (suspended), line protocol is down

Even after following the proper order of operations, the SVI would still become suspended after enabling vn-segment.

  • vlan # ; vn-segment #
  • interface nve 1 ; member vni # ; mcast-group #
  • interface #

So I started troubleshooting from the bottom up. If you are certain that all of the items below are correct, scroll all the way down for the remediation.

  • Are all the features enabled on the Spines and Leafs?
  • Is the unicast routing working (UNDERLAY) and can you reach your neighbors?
  • Is the multicast routing working and can you reach IGMP mcast-group members?
  • Is the VNI up for the particular mcast-group and are you learning MACs via VTEPs?
  • Is the control plane (MP-BGP) working and are the routes being learned?

Are all the features enabled on the Spines and Leafs?

Make sure that all the features are enabled (use either OSPF or IS-IS for the underlay).

install feature-set fabric
feature-set fabric
feature fabric forwarding
feature interface-vlan
feature ospf
!feature isis
feature nv overlay
feature bgp
feature vn-segment-vlan-based
nv overlay evpn
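
If you have many Leafs and Spines to check, the feature list above can be compared against a saved running-config. Below is a minimal Python sketch, assuming you already have the output of "show running-config" as text. The function name missing_features and the sample config are hypothetical and only for illustration; the list of required lines simply mirrors the one in this post.

```python
# Lines taken from the feature list in this post; adjust for your platform.
REQUIRED_FEATURES = [
    "install feature-set fabric",
    "feature-set fabric",
    "feature fabric forwarding",
    "feature interface-vlan",
    "feature nv overlay",
    "feature bgp",
    "feature vn-segment-vlan-based",
    "nv overlay evpn",
]
# Either IGP is acceptable for the underlay.
UNDERLAY_IGP = ["feature ospf", "feature isis"]


def missing_features(running_config):
    """Return the required lines that are absent from the running-config text."""
    present = {line.strip() for line in running_config.splitlines()}
    missing = [f for f in REQUIRED_FEATURES if f not in present]
    if not any(igp in present for igp in UNDERLAY_IGP):
        missing.append("feature ospf | feature isis")
    return missing


# Hypothetical sample: "feature nv overlay" was left out on purpose.
sample = """\
install feature-set fabric
feature-set fabric
feature fabric forwarding
feature interface-vlan
feature ospf
feature bgp
feature vn-segment-vlan-based
nv overlay evpn
"""

print(missing_features(sample))  # -> ['feature nv overlay']
```

An empty list means every line from the checklist was found; it says nothing about whether the features are licensed or actually running.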

Is the unicast routing working (UNDERLAY/OSPF) and can you reach your neighbors?

If this is where it fails, verify your underlay routing. The example below is from the perspective of VTEP2/Leaf102 (N5K2, .52) toward VTEP1 (N5K1, .51) and both Spines (.71 and .72).

N5K2(config-if)# sh ip route ospf-UNDERLAY 
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

10.0.0.51/32, ubest/mbest: 2/0
*via 10.0.0.71, Eth1/5, [110/9], 17:14:52, ospf-UNDERLAY, intra
*via 10.0.0.72, Eth1/6, [110/9], 17:14:52, ospf-UNDERLAY, intra
10.0.0.71/32, ubest/mbest: 1/0
*via 10.0.0.71, Eth1/5, [110/5], 21:23:18, ospf-UNDERLAY, intra
via 10.0.0.71, Eth1/5, [250/0], 21:23:26, am
10.0.0.72/32, ubest/mbest: 1/0
*via 10.0.0.72, Eth1/6, [110/5], 21:23:20, ospf-UNDERLAY, intra
via 10.0.0.72, Eth1/6, [250/0], 21:23:26, am
10.0.0.254/32, ubest/mbest: 2/0
*via 10.0.0.71, Eth1/5, [110/5], 21:23:18, ospf-UNDERLAY, intra
*via 10.0.0.72, Eth1/6, [110/5], 21:23:18, ospf-UNDERLAY, intra
10.1.2.51/32, ubest/mbest: 2/0
*via 10.0.0.71, Eth1/5, [110/9], 17:14:52, ospf-UNDERLAY, intra
*via 10.0.0.72, Eth1/6, [110/9], 17:14:52, ospf-UNDERLAY, intra
10.1.2.71/32, ubest/mbest: 1/0
*via 10.0.0.71, Eth1/5, [110/5], 17:18:20, ospf-UNDERLAY, intra
10.1.2.72/32, ubest/mbest: 1/0
*via 10.0.0.72, Eth1/6, [110/5], 17:18:20, ospf-UNDERLAY, intra

N5K2(config-if)# ping 10.0.0.51
PING 10.0.0.51 (10.0.0.51): 56 data bytes
64 bytes from 10.0.0.51: icmp_seq=0 ttl=253 time=1.129 ms

N5K2(config-if)# ping 10.0.0.71
PING 10.0.0.71 (10.0.0.71): 56 data bytes
64 bytes from 10.0.0.71: icmp_seq=0 ttl=254 time=1.076 ms

N5K2(config-if)# ping 10.0.0.72
PING 10.0.0.72 (10.0.0.72): 56 data bytes
64 bytes from 10.0.0.72: icmp_seq=0 ttl=254 time=1.169 ms

Is the multicast routing working and can you reach IGMP mcast-group members?

If this is where it fails, verify that you have an RP defined on the Spine (static, Auto-RP or anycast) and that your VTEPs have the RP assigned. The example below is from the elected RP (N7K2) and VTEP2 (N5K2).

N7K2(config-if-range)# sh run pim
feature pim

ip pim send-rp-announce loopback0 group-list 224.0.0.0/4 bidir
ip pim send-rp-discovery loopback0
ip pim ssm range 232.0.0.0/8
ip pim bfd

N5K2(config-if)# sh ip pim rp
PIM RP Status Information for VRF "default"
BSR disabled
Auto-RP listen-only mode
Auto-RP RPA: 10.0.0.72, uptime: 21:26:47, expires: 00:02:55
BSR RP Candidate policy: None
BSR RP policy: None
Auto-RP Announce policy: None
Auto-RP Discovery policy: None

RP: 10.0.0.72, (1), 
uptime: 21:26:47 priority: 0, 
RP-source: 10.0.0.72 (A), 
group ranges:
224.0.0.0/4 (bidir) , expires: 00:02:55 (A)

N5K2(config-if)# sh ip mroute 
IP Multicast Routing Table for VRF "default"

(*, 224.0.0.0/4), bidir, uptime: 21:26:50, pim ip 
Incoming interface: Ethernet1/6, RPF nbr: 10.0.0.72
Outgoing interface list: (count: 1)
Ethernet1/6, uptime: 21:26:50, pim, (RPF)

(*, 231.1.1.1/32), bidir, uptime: 21:26:46, igmp ip pim 
Incoming interface: Ethernet1/6, RPF nbr: 10.0.0.72
Outgoing interface list: (count: 2)
Ethernet1/6, uptime: 21:26:46, pim, (RPF)
loopback0, uptime: 21:26:46, igmp

(*, 232.0.0.0/8), uptime: 21:26:50, pim ip 
Incoming interface: Null, RPF nbr: 0.0.0.0
Outgoing interface list: (count: 0)

(*, 239.1.1.0/32), bidir, uptime: 21:26:50, ip pim nve 
Incoming interface: Ethernet1/6, RPF nbr: 10.0.0.72
Outgoing interface list: (count: 2)
nve1, uptime: 17:40:46, nve
Ethernet1/6, uptime: 21:26:50, pim, (RPF)

--- 231.1.1.1 ping multicast statistics ---
5 packets transmitted,
0 packets received, 100% packet loss
N5K2(config-if)# ping multicast 231.1.1.1 interface e1/6
PING 231.1.1.1 (231.1.1.1): 56 data bytes
64 bytes from 10.0.0.51: icmp_seq=0 ttl=253 time=1.108 ms

--- 231.1.1.1 ping multicast statistics ---
5 packets transmitted,
From member 10.0.0.51: 5 packets received, 0.00% packet loss
--- in total, 1 group member responded ---

Is the VNI up for the particular mcast-group and are you learning MACs via VTEPs?

If this is where it fails, you may have misconfigured the nve interface or you may not have the correct VXLAN mapping (vn-segment). The NVE should be up for each VNI, whether L2VNI or L3VNI. The example below is from VTEP2.

N5K2(config-if)# sh run vlan 10

vlan 10
vn-segment 10100

N5K2(config-if)# sh run int nve 1

interface nve1
no shutdown
source-interface loopback1
host-reachability protocol bgp
member vni 5000 associate-vrf
member vni 10100
suppress-arp
mcast-group 239.1.1.0
member vni 10200
suppress-arp
mcast-group 239.1.1.0

N5K2(config-router)# sh nve vni
Codes: CP - Control Plane DP - Data Plane 
UC - Unconfigured SA - Suppress ARP 
SU - Suppress Unknown Unicast

Interface VNI      Multicast-group   State Mode Type [BD/VRF]      Flags
--------- -------- ----------------- ----- ---- ------------------ -----
nve1      5000     n/a               Up    CP   L3 [transit-l3-vni]
nve1      10100    239.1.1.0         Up    CP   L2 [10]            SA
nve1      10200    239.1.1.0         Up    CP   L2 [20]            SA
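
The same check can be scripted when there are many VNIs to eyeball. Below is a minimal Python sketch, assuming the "sh nve vni" output is available as text in the column order shown above. The function name vnis_not_up is hypothetical, and the sample is the output from this post with VNI 10100 changed to Down purely for illustration.

```python
import re

# One table row: interface, VNI, mcast group, state, mode, type, [BD/VRF].
ROW = re.compile(
    r"^(nve\d+)\s+(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(L2|L3)\s+\[([^\]]+)\]"
)


def vnis_not_up(show_nve_vni):
    """Return (vni, state) for every VNI whose State column is not 'Up'."""
    bad = []
    for line in show_nve_vni.splitlines():
        match = ROW.match(line.strip())
        if match and match.group(4) != "Up":
            bad.append((int(match.group(2)), match.group(4)))
    return bad


# Illustrative sample: the Down state for 10100 is made up for this example.
sample = """\
Interface VNI      Multicast-group   State Mode Type [BD/VRF]      Flags
--------- -------- ----------------- ----- ---- ------------------ -----
nve1      5000     n/a               Up    CP   L3 [transit-l3-vni]
nve1      10100    239.1.1.0         Down  CP   L2 [10]            SA
nve1      10200    239.1.1.0         Up    CP   L2 [20]            SA
"""

print(vnis_not_up(sample))  # -> [(10100, 'Down')]
```

The regular expression only relies on whitespace between columns, so it does not depend on the exact column widths.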

N5K2(config-if)# sh mac address-table vlan 10
Legend: 
* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
age - seconds since last seen,+ - primary entry using vPC Peer-Link
   VLAN     MAC Address      Type      age     Secure NTFY   Ports/SWID.SSID.LID
---------+-----------------+--------+---------+------+----+------------------
* 10       001b.2188.8075    dynamic   50         F    F  Eth1/2
* 10       001b.218d.3d98    dynamic   0          F    F  nve1/10.1.2.51
* 10       2020.0000.00aa    static    0          F    F  sup-eth2

Is the control plane (MP-BGP) working and are the routes being learned?

If BGP is not up and you are not learning the VTEP addresses (prefixes), you need to verify your BGP EVPN configuration.

N5K2(config-if)# sh bgp l2vpn evpn summary 
BGP summary information for VRF default, address family L2VPN EVPN
BGP router identifier 10.0.0.52, local AS number 65535
BGP table version is 168, L2VPN EVPN config peers 2, capable peers 2
7 network entries and 9 paths using 928 bytes of memory
BGP attribute entries [7/1008], BGP AS path entries [0/0]
BGP community entries [0/0], BGP clusterlist entries [2/8]

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.1.2.71       4 65535    1121    1118      168    0    0 18:21:22 2
10.1.2.72       4 65535    1126    1122      168    0    0 18:21:12 2
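
In the summary, a peer is established when the last column shows a prefix count rather than a state name. Below is a minimal Python sketch of that check, assuming the "sh bgp l2vpn evpn summary" output is available as text. The function name neighbors_not_established is hypothetical, and the second neighbor row in the sample is made up (Idle) purely for illustration.

```python
def neighbors_not_established(summary):
    """Return {neighbor: state} for peers whose last column is not a prefix count."""
    down = {}
    for line in summary.splitlines():
        fields = line.split()
        # Neighbor rows start with an IPv4 address followed by BGP version 4
        # and have at least the 10 columns of the summary table.
        if len(fields) >= 10 and fields[0].count(".") == 3 and fields[1] == "4":
            state = " ".join(fields[9:])  # a state may contain spaces
            if not state.isdigit():
                down[fields[0]] = state
    return down


# Illustrative sample: the Idle row is made up for this example.
sample = """\
BGP router identifier 10.0.0.52, local AS number 65535
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.1.2.71       4 65535    1121    1118      168    0    0 18:21:22 2
10.1.2.72       4 65535       0       0        0    0    0 00:01:12 Idle
"""

print(neighbors_not_established(sample))  # -> {'10.1.2.72': 'Idle'}
```

An established peer with a prefix count of 0 will not be flagged here, so it is still worth looking at the count itself.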

N5K2(config-if)# sh bgp l2vpn evpn 
BGP routing table information for VRF default, address family L2VPN EVPN
BGP table version is 168, local router ID is 10.0.0.52
Status: s-suppressed, x-deleted, S-stale, d-dampened, h-history, *-valid, >-best
Path type: i-internal, e-external, c-confed, l-local, a-aggregate, r-redist, I-injected
Origin codes: i - IGP, e - EGP, ? - incomplete, | - multipath, & - backup

   Network            Next Hop            Metric     LocPrf     Weight Path
Route Distinguisher: 10.0.0.51:32777
* i[2]:[0]:[0]:[48]:[001b.218d.3d98]:[0]:[0.0.0.0]/216
                      10.1.2.51                         100          0 i
*>i                   10.1.2.51                         100          0 i
*>i[2]:[0]:[0]:[48]:[001b.218d.3d98]:[32]:[10.0.1.11]/272
                      10.1.2.51                         100          0 i
* i                   10.1.2.51                         100          0 i

Route Distinguisher: 10.0.0.52:32777    (L2VNI 10100)
*>l[2]:[0]:[0]:[48]:[001b.2188.8075]:[0]:[0.0.0.0]/216
                      10.1.2.52                         100      32768 i
*>i[2]:[0]:[0]:[48]:[001b.218d.3d98]:[0]:[0.0.0.0]/216
                      10.1.2.51                         100          0 i
*>l[2]:[0]:[0]:[48]:[001b.2188.8075]:[32]:[10.0.1.12]/272
                      10.1.2.52                         100      32768 i
*>i[2]:[0]:[0]:[48]:[001b.218d.3d98]:[32]:[10.0.1.11]/272
                      10.1.2.51                         100          0 i

N5K2(config-if)# sh bgp l2vpn evpn 10.0.1.11
Route Distinguisher: 10.0.0.51:32777
BGP routing table entry for [2]:[0]:[0]:[48]:[001b.218d.3d98]:[32]:[10.0.1.11]/272, version 166
Paths: (2 available, best #1)
Flags: (0x000202) on xmit-list, is not in l2rib/evpn, is not in HW, , is locked

  Advertised path-id 1
  Path type: internal, path is valid, is best path
  AS-Path: NONE, path sourced internal to AS
    10.1.2.51 (metric 9) from 10.1.2.71 (10.1.2.71)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10100 5000
      Extcommunity: 
          RT:65535:5000
          RT:65535:10100
          ENCAP:8
          Router MAC:00de.fb12.1a7c
      Originator: 10.0.0.51 Cluster list: 10.1.2.71 

  Path type: internal, path is valid, not best reason: Neighbor Address
  AS-Path: NONE, path sourced internal to AS
    10.1.2.51 (metric 9) from 10.1.2.72 (10.1.2.72)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10100 5000
      Extcommunity: 
          RT:65535:5000
          RT:65535:10100
          ENCAP:8
          Router MAC:00de.fb12.1a7c
      Originator: 10.0.0.51 Cluster list: 10.1.2.72 

  Path-id 1 not advertised to any peer

Route Distinguisher: 10.0.0.52:32777    (L2VNI 10100)
BGP routing table entry for [2]:[0]:[0]:[48]:[001b.218d.3d98]:[32]:[10.0.1.11]/272, version 165
Paths: (1 available, best #1)
Flags: (0x00021a) on xmit-list, is in l2rib/evpn, is not in HW, 

  Advertised path-id 1
  Path type: internal, path is valid, is best path
             Imported from 10.0.0.51:32777:[2]:[0]:[0]:[48]:[001b.218d.3d98]:[32]:[10.0.1.11]/272 
  AS-Path: NONE, path sourced internal to AS
    10.1.2.51 (metric 9) from 10.1.2.71 (10.1.2.71)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10100 5000
      Extcommunity: 
          RT:65535:5000
          RT:65535:10100
          ENCAP:8
          Router MAC:00de.fb12.1a7c
      Originator: 10.0.0.51 Cluster list: 10.1.2.71 

  Path-id 1 not advertised to any peer

If everything above checks out, please consider this:

  1. Per Cisco support, you should be using separate loopback addresses for VTEP/unicast underlay/BGP; otherwise it is not a supported configuration. That being said, try to reconfigure your BGP peering with a different loopback address, even though I haven’t seen issues with using, e.g., lo0 for all of them.
  2. If the anycast gateway feature is enabled for a specific VNI, then the anycast gateway feature must be enabled on all VTEPs that have that VNI configured. Having the anycast gateway feature configured on only some of the VTEPs enabled for a specific VNI is not supported.
    • In plain terms: if you intend to use fabric forwarding anycast gateway for seamless VM mobility across your VTEPs, your SVI needs to be part of it; otherwise remove that global command. For example, if you vMotion a VM from one VTEP to another, the anycast gateway makes the move painless because the VM’s gateway MAC address does not change, so traffic keeps being forwarded without interruption. This is a good feature.

The reason the SVI was suspended as soon as its VLAN became a vn-segment was that fabric forwarding anycast gateway was enabled globally while the SVI was not part of it.

N5K2(config-if)# sh run | in anycast

fabric forwarding anycast-gateway-mac 2020.0000.00AA
fabric forwarding mode anycast-gateway

N5K2(config-if)# sh run int vlan 10

interface Vlan10
no shutdown
vrf member transit-l3-vni
ip address 10.0.1.52/24

As soon as I enabled fabric forwarding under the SVI, the suspended state went away.

N5K2(config-if)# sh int vlan 10
Vlan10 is down (suspended), line protocol is down

N5K2(config-if)# int vlan 10
N5K2(config-if)# fabric forwarding mode anycast-gateway
N5K2(config-if)# sh int vlan 10
Vlan10 is up, line protocol is up
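
To catch this before an SVI gets suspended, the running-config can be scanned for the same mismatch. Below is a minimal Python sketch, assuming the running-config is available as text with the usual indentation under each stanza. The function name svis_missing_anycast is hypothetical, and the Vlan20 stanza in the sample (including its IP address) is made up for illustration. It only encodes the condition I ran into here, not every reason an SVI can be suspended.

```python
import re


def svis_missing_anycast(running_config):
    """Return VLAN IDs that have a vn-segment and an SVI, where the SVI lacks
    'fabric forwarding mode anycast-gateway' although the global
    anycast-gateway-mac is configured."""
    if "fabric forwarding anycast-gateway-mac" not in running_config:
        return []  # anycast gateway is not enabled globally, nothing to check

    vxlan_vlans = set()   # VLANs mapped to a VNI
    svi_anycast = {}      # VLAN ID -> True if its SVI has anycast mode
    section, current = None, None
    for raw in running_config.splitlines():
        line = raw.strip()
        if not raw.startswith((" ", "\t")):
            # Unindented line: a new top-level stanza begins.
            section, current = None, None
            m = re.fullmatch(r"vlan (\d+)", line)
            if m:
                section, current = "vlan", int(m.group(1))
            m = re.fullmatch(r"interface Vlan(\d+)", line, re.IGNORECASE)
            if m:
                section, current = "svi", int(m.group(1))
                svi_anycast[current] = False
        elif section == "vlan" and line.startswith("vn-segment "):
            vxlan_vlans.add(current)
        elif section == "svi" and line == "fabric forwarding mode anycast-gateway":
            svi_anycast[current] = True
    return sorted(v for v in vxlan_vlans if svi_anycast.get(v) is False)


# Sample based on this post; the Vlan20 SVI is hypothetical.
sample = """\
fabric forwarding anycast-gateway-mac 2020.0000.00AA
vlan 10
  vn-segment 10100
vlan 20
  vn-segment 10200
interface Vlan10
  no shutdown
  vrf member transit-l3-vni
  ip address 10.0.1.52/24
interface Vlan20
  no shutdown
  vrf member transit-l3-vni
  ip address 10.0.2.52/24
  fabric forwarding mode anycast-gateway
"""

print(svis_missing_anycast(sample))  # -> [10]
```

VLANs without an SVI are ignored, since a pure L2VNI has no SVI to suspend.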

I also noticed that the L3VNI kicks in right away and the route is added under its route distinguisher.

N5K2(config-if)# sh bgp l2vpn evpn 
BGP routing table information for VRF default, address family L2VPN EVPN
BGP table version is 170, local router ID is 10.0.0.52
Status: s-suppressed, x-deleted, S-stale, d-dampened, h-history, *-valid, >-best
Path type: i-internal, e-external, c-confed, l-local, a-aggregate, r-redist, I-injected
Origin codes: i - IGP, e - EGP, ? - incomplete, | - multipath, & - backup

Route Distinguisher: 10.0.0.52:3 (L3VNI 5000)
*>i[2]:[0]:[0]:[48]:[001b.218d.3d98]:[32]:[10.0.1.11]/272
                      10.1.2.51                         100          0 i

N5K2(config-if)# sh bgp l2vpn evpn 10.0.1.11
Route Distinguisher: 10.0.0.52:3 (L3VNI 5000)
BGP routing table entry for [2]:[0]:[0]:[48]:[001b.218d.3d98]:[32]:[10.0.1.11]/272, version 164
Paths: (1 available, best #1)
Flags: (0x000202) on xmit-list, is not in l2rib/evpn, is not in HW,

  Advertised path-id 1
  Path type: internal, path is valid, is best path
             Imported from 10.0.0.51:32777:[2]:[0]:[0]:[48]:[001b.218d.3d98]:[32]:[10.0.1.11]/272 
  AS-Path: NONE, path sourced internal to AS
    10.1.2.51 (metric 9) from 10.1.2.71 (10.1.2.71)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 10100 5000
      Extcommunity: 
          RT:65535:5000
          RT:65535:10100
          ENCAP:8
          Router MAC:00de.fb12.1a7c
      Originator: 10.0.0.51 Cluster list: 10.1.2.71

  Path-id 1 not advertised to any peer

The bottom line: if you want to use anycast gateway on your VTEPs, make sure the SVIs are part of it as well. Otherwise, don’t apply that global configuration. Big thanks to MM from the CCIE DC Study Group for helping me nail this down.
