An SD-WAN network is an overlay of point-to-point tunnels between branch devices. Building this overlay, which is effectively the SD-WAN data plane, requires the topology to be discovered first. In a Versa Networks SD-WAN deployment, the Site ID and site information of each branch device are advertised to all remote CPE devices using MP-BGP. To distribute this information to all appliances, one or more devices act as route reflectors for the branches. These devices are called controllers, and they play no role in the data plane.

When a Versa CPE device receives a node advertisement from the controller(s), the CPE creates a pair of tunnels (one clear and one secure) to the advertised remote CPE tunnel endpoint. It then maps the forwarding process to the WAN transport interface that will be used to reach the WAN transport interface of the remote node. These tunnels are created automatically. BGP is also used to provide reachability between the branches using the vpnv4 address family (AFI/SAFI 1/128).
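
The tunnel-pair behaviour described above can be pictured with a small toy model. This is only an illustrative sketch, not Versa's implementation: the class and function names (`NodeAdvertisement`, `Tunnel`, `tunnels_for`) are hypothetical, and the endpoint addresses are borrowed from the Branch2 and Branch3 rows of the dynamic-tunnel output shown later in this article.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class NodeAdvertisement:
    """Hypothetical view of what a branch learns about a remote site."""
    site_id: int
    site_name: str
    clear_endpoint: str   # remote endpoint of the cleartext tunnel
    secure_endpoint: str  # remote endpoint of the secure tunnel


@dataclass(frozen=True)
class Tunnel:
    remote_site: str
    site_id: int
    kind: str             # "cleartext" or "secure"
    remote_ip: str


def tunnels_for(adv: NodeAdvertisement) -> List[Tunnel]:
    """Create the pair of tunnels (one clear, one secure) for one advertisement."""
    return [
        Tunnel(adv.site_name, adv.site_id, "cleartext", adv.clear_endpoint),
        Tunnel(adv.site_name, adv.site_id, "secure", adv.secure_endpoint),
    ]


# Advertisements as they might arrive from the controller (illustrative values).
advertisements = [
    NodeAdvertisement(102, "Branch2", "10.0.128.102", "10.0.160.102"),
    NodeAdvertisement(103, "Branch3", "10.0.128.103", "10.0.160.103"),
]

overlay: Dict[int, List[Tunnel]] = {}
for adv in advertisements:
    overlay[adv.site_id] = tunnels_for(adv)

for site_id, tunnels in overlay.items():
    for t in tunnels:
        print(site_id, t.remote_site, t.kind, t.remote_ip)
```

The point of the sketch is only that every learned site yields two overlay entries, keyed by the Site ID, without any manual tunnel configuration.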

BGP sessions between the branches and the controllers are therefore essential for SD-WAN discovery, data plane overlay creation, and reachability distribution between sites.

Let’s see what happens when the control plane (BGP) is affected, e.g. when the controller(s) become isolated or fail. Obviously, it would not be possible to onboard or discover new appliances. However, a BGP capability called “Graceful Restart Capability” helps preserve forwarding state during a BGP issue.

BGP Graceful Restart Mechanism

Normally when the BGP control plane in a router fails, its BGP sessions to the peers are lost. Upon detection of the BGP session failure, peers withdraw all routes associated with the failed session.

Exchange of Graceful Restart Capabilities

The BGP Graceful Restart mechanism defines protocol extensions that allow a speaker to indicate its GR capability to its peers during initial BGP session establishment. The neighboring speakers exchange the Graceful Restart capability in their BGP Open messages. Once the capability has been exchanged, when a router that previously advertised GR restarts or fails, its neighbors retain the routes learned from it but mark them as stale. In essence, during session establishment a router tells its peers: “If I do not reestablish the failed session within a certain time (indicated in the Restart Time field of the capability), delete all the routes that were retained across the restart and stop forwarding traffic on them. Until then, mark the routes as stale and continue to use them.”
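
To make the Restart Time field more concrete, here is a minimal Python sketch that decodes the value part of a Graceful Restart capability as laid out in RFC 4724: 4 bits of restart flags, a 12-bit Restart Time in seconds, then one 4-byte tuple (AFI, SAFI, flags) per address family. The function and class names are made up for illustration, and the sample bytes are constructed by hand rather than captured from a Versa device.

```python
import struct
from dataclasses import dataclass
from typing import List, Tuple

GR_CAPABILITY_CODE = 64  # Graceful Restart capability code (RFC 4724)


@dataclass
class GracefulRestartCapability:
    restart_state: bool                    # R bit: the speaker has just restarted
    restart_time: int                      # seconds, 12-bit field (0..4095)
    families: List[Tuple[int, int, bool]]  # (AFI, SAFI, forwarding state preserved)


def parse_gr_capability(value: bytes) -> GracefulRestartCapability:
    """Decode the value part of a Graceful Restart capability."""
    if len(value) < 2 or (len(value) - 2) % 4 != 0:
        raise ValueError("GR capability value must be 2 + 4*n bytes long")
    (head,) = struct.unpack("!H", value[:2])
    families = []
    for offset in range(2, len(value), 4):
        afi, safi, flags = struct.unpack("!HBB", value[offset:offset + 4])
        families.append((afi, safi, bool(flags & 0x80)))  # top bit is the F bit
    return GracefulRestartCapability(
        restart_state=bool(head & 0x8000),  # top bit of the restart flags
        restart_time=head & 0x0FFF,         # low 12 bits
        families=families,
    )


# Hand-built example: Restart Time 120 s, VPNv4 (AFI 1 / SAFI 128) with the F bit set.
sample = struct.pack("!H", 120) + struct.pack("!HBB", 1, 128, 0x80)
cap = parse_gr_capability(sample)
print(cap.restart_time, cap.families)
```

With this layout the value length is always 2 + 4 × n bytes for n address families. If the length of 38 logged later in this article is the value length, it would correspond to nine address-family tuples.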

We’ll use the topology below to present the BGP Graceful Restart capability:

Versa Networks SD-WAN uses a special Virtual Router (VR) construct for building the topology. Its name is Control VR and the BGP sessions between the Controller/RR and the branches are established in this VR. There is a separate Control VR and BGP sessions with the Controllers per tenant.

We have two possibilities:

  1. Graceful Restart Capability exchanged between branches and controllers (default option)
  2. Graceful Restart Capability not used between branches and controllers.

We start the discussion with the second one:

Graceful Restart Capability not used between branches and controllers.

Graceful Restart capability is exchanged between branches and controller(s) in the default configuration. Let’s see what the consequences are if the feature is disabled on Branch1 and the BGP session between the branch appliance and the controller runs into trouble. As the output below shows, Branch2 (10.0.160.102) supports the BGP GR capability, but Branch1 (10.0.160.101) does not.

admin@Controller-1-cli> show bgp graceful-restart detail Tenant1-Control-VR
routing-instance: Tenant1-Control-VR
BGP instance: 2
Peer: 10.0.160.101
Peer does not support Restarter functionality
Restart-Family:
Stale-routes:
BGP instance: 2
Peer: 10.0.160.102
Peer supports Restarter functionality
Restart-Family: inet-unicast inet-vpn inet-versa-private inet-multicast inet6-unicast inet6-vpn inet6-multicast
  Stale-routes:

Branch1 receives Branch2 prefixes via the BGP session with the controller and installs them in the Tenant1 LAN VRF:

admin@Branch1-cli> show route table l3vpn.ipv4.unicast routing-instance Tenant1-Control-VR receive-protocol bgp
Routes for Routing instance : Tenant1-Control-VR  AFI: ipv4  SAFI: unicast
Routing entry for 192.168.3.0/24
  Peer Address       : 10.0.160.1
  Route Distinguisher: 3L:3
  Next-hop           : 10.0.160.102
  VPN Label          : 24705
  Local Preference   : 110
  AS Path            : N/A
  Origin             : Igp
  MED                : 0
  Community          : [ 8009:8009 ]
  Extended community : [ target:3L:3 ]
  Preference         : Default
admin@Branch1-cli> show route routing-instance Tenant1-LAN-VR 192.168.3.0/24
Routes for Routing instance : Tenant1-LAN-VR  AFI: ipv4  SAFI: unicast
[+] - Active Route
Routing entry for 192.168.3.0 (mask 255.255.255.0) [+]
Known via 'BGP', distance 200
  Redistributing via BGP
  Last update from 10.0.160.102 00:04:42 ago
Routing Descriptor Blocks:
* 10.0.160.102 , via Indirect 00:04:42 ago

At some point the Controller encounters an issue and, as a result, the following events appear in the log:

Controller address (10.0.160.1/32) is removed from the routing table:
20-06-18 01:51:24.00: i3async.c    0984: DELETE of dest 10.0.160.1/32, nexthop 0.0.0.0
Routes received from the controller (BGP RR) are re-evaluated:
20-06-18 01:51:24.00: qbdcnhr.c    0508(Tenant1-Control-VR): Route to BGP NH 10.0.160.1 gone; update routes
Branch2’s prefixes are deleted from the LAN VRF routing table:
20-06-18 01:51:24.00: rtds_vstated 0241: Sending DEL for route dest 192.168.3.0/24, RTT 23

Following the controller failure, Branch1 removes the dynamic tunnels to the remote branches:

admin@Branch1-cli> show interfaces dynamic-tunnels
                                                                                             REMOTE
LOCAL                                                                                        SITE    TUNNEL     REMOTE SITE
NAME       INTERFACE  TENANT    VRF                  LOCAL IP      REMOTE IP   OPER   ADMIN  ID      TYPE       NAME
------------------------------------------------------------------------------------------------------------------------------
dtvi-0/52  tvi-0/2.0  Provider  Provider-Control-VR  10.0.64.101   10.0.64.1   up     up     1       cleartext  Controller-1
dtvi-0/55  tvi-0/4.0  Tenant1   Tenant1-Control-VR   10.0.128.101  10.0.128.1  up     up     1       cleartext  Controller-1
ptvi1      tvi-0/3.0  Provider  Provider-Control-VR  10.0.96.101   10.0.96.1   pdown  up     1       secure     Controller-1
ptvi2      tvi-0/5.0  Tenant1   Tenant1-Control-VR   10.0.160.101  10.0.160.1  pdown  up     1       secure     Controller-1

and the prefixes learned via these tunnels:

admin@Branch1-cli> show route routing-instance Tenant1-LAN-VR 192.168.3.0/24
Routes for Routing instance : Tenant1-LAN-VR  AFI: ipv4  SAFI: unicast

Now let’s see what happens when BGP Graceful Restart is negotiated between the Branch1 appliance and the controller.

Graceful Restart Capability is used between branches and controllers.

The Graceful Restart capability is enabled by default for all deployed organizations. The specific configuration is part of the BGP Control VR for the respective tenant/organization:

admin@Branch1-cli> show configuration routing-instances Tenant1-Control-VR protocols bgp 2 graceful-restart | display set
set routing-instances Tenant1-Control-VR protocols bgp 2 graceful-restart enable
set routing-instances Tenant1-Control-VR protocols bgp 2 graceful-restart maximum-restart-time 3600
set routing-instances Tenant1-Control-VR protocols bgp 2 graceful-restart recovery-time 3600
set routing-instances Tenant1-Control-VR protocols bgp 2 graceful-restart select-defer-time 30
set routing-instances Tenant1-Control-VR protocols bgp 2 graceful-restart stalepath-time 3600
set routing-instances Tenant1-Control-VR protocols bgp 2 graceful-restart dynamic-peer-restart-time 3600
set routing-instances Tenant1-Control-VR protocols bgp 2 graceful-restart multiplier 8
set routing-instances Tenant1-Control-VR protocols bgp 2 graceful-restart helper enable
set routing-instances Tenant1-Control-VR protocols bgp 2 graceful-restart family inet-vpn unicast forwarding-state-bit

And the same information in the Versa Director GUI, advanced BGP configuration in the tenant’s control VR:

The Graceful Restart capability (capability code 64) is negotiated by the BGP peers (Branch1 and the Controller) via Open messages while the session is being established:

20-06-18 02:15:57.00: qbnmutil.c   0948(Tenant1-Control-VR):   Capability code : 64, length : 38

The Controller/RR encounters a failure:

20-06-18 02:51:24.00: i3async.c    2478: DELETE of dest 10.0.160.1/32, nexthop 0.0.0.0

However, this time the BGP information received from the controller/RR is marked as stale:

20-06-18 02:51:24.00: rtd_bgp_serv 2908: Mark SD-WAN Info from BGP (RID 10.0.160.101 RTI Tenant1-Control-VR RTI-ID 21) stale.

Branch2’s prefix (192.168.3.0/24) is also marked as stale in the Adj-RIB-In:

20-06-18 02:51:24.00: qbrautil.c   2600(Tenant1-Control-VR): stale route 6.8.17.0.2.0.0.0.3.0.3.192.168.3/112 to ADJ-RIB-IN 1 0 1 0x7fe8a858eb40
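
The dotted key in this log line can be read as a labeled VPNv4 NLRI, assuming it follows the standard layout: a 3-byte MPLS label field, an 8-byte Route Distinguisher, then the IPv4 prefix bytes, with the length after the slash counting all of these bits. That reading is an assumption about the log format, not something the log states. The sketch below (with a hypothetical helper name, `decode_vpnv4_key`) decodes the key under that assumption.

```python
import ipaddress


def decode_vpnv4_key(key: str) -> dict:
    """Decode a dotted 'label.RD.prefix/bits' key, assuming VPNv4 NLRI layout."""
    dotted, bits = key.split("/")
    octets = bytes(int(o) for o in dotted.split("."))
    total_bits = int(bits)

    # 3-byte label field: 20-bit label, 3 traffic-class bits, 1 bottom-of-stack bit.
    label_field = int.from_bytes(octets[0:3], "big")
    label = label_field >> 4
    bottom_of_stack = bool(label_field & 1)

    # 8-byte Route Distinguisher: 2-byte type followed by a 6-byte value.
    rd_type = int.from_bytes(octets[3:5], "big")
    rd_value = octets[5:11]
    if rd_type == 0:    # 2-byte AS number : 4-byte assigned number
        rd = f"{int.from_bytes(rd_value[:2], 'big')}:{int.from_bytes(rd_value[2:], 'big')}"
    elif rd_type == 1:  # IPv4 address : 2-byte assigned number
        rd = f"{ipaddress.IPv4Address(rd_value[:4])}:{int.from_bytes(rd_value[4:], 'big')}"
    elif rd_type == 2:  # 4-byte AS number : 2-byte assigned number
        rd = f"{int.from_bytes(rd_value[:4], 'big')}:{int.from_bytes(rd_value[4:], 'big')}"
    else:
        rd = rd_value.hex()

    # Whatever remains after label (24 bits) and RD (64 bits) is the IPv4 prefix.
    prefix_len = total_bits - 24 - 64
    prefix_bytes = octets[11:].ljust(4, b"\x00")
    prefix = ipaddress.ip_network(
        f"{ipaddress.IPv4Address(prefix_bytes)}/{prefix_len}", strict=False
    )
    return {
        "label": label,
        "bottom_of_stack": bottom_of_stack,
        "rd_type": rd_type,
        "rd": rd,
        "prefix": str(prefix),
    }


decoded = decode_vpnv4_key("6.8.17.0.2.0.0.0.3.0.3.192.168.3/112")
print(decoded)
```

Under this reading the key yields label 24705 and prefix 192.168.3.0/24, which agrees with the VPN Label and routing entry shown in the CLI output. The Route Distinguisher decodes as type 2 with value 3:3, which appears to be what the CLI renders as 3L:3.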

The Versa OS CLI also shows Branch2’s vpnv4 prefix marked as stale:

admin@Branch1-cli> show route table l3vpn.ipv4.unicast routing-instance Tenant1-Control-VR receive-protocol bgp 192.168.3.0/24
Routes for Routing instance : Tenant1-Control-VR  AFI: ipv4  SAFI: unicast
Routing entry for 192.168.3.0/24
  Peer Address       : 10.0.160.1
  Route Distinguisher: 3L:3
  Next-hop           : 10.0.160.102
  VPN Label          : 24705
  Local Preference   : 110
  AS Path            : N/A
  Origin             : Igp
  MED                : 0
  Community          : [ 8009:8009 ]
  Extended community : [ target:3L:3 ]
  Preference         : Default
  Graceful Restart   : Stale

This time Branch1 removes neither the dynamic tunnels to the remote branches nor the prefixes learned via these tunnels:

admin@Branch1-cli> show interfaces dynamic-tunnels | grep Tenant1
dtvi-0/55  tvi-0/4.0  Tenant1   Tenant1-Control-VR   10.0.128.101  10.0.128.1    up     up     1       cleartext  Controller-1
dtvi-0/89  tvi-0/4.0  Tenant1   Tenant1-Control-VR   10.0.128.101  10.0.128.104  up     up     104     cleartext  Branch4
dtvi-0/90  tvi-0/5.0  Tenant1   Tenant1-Control-VR   10.0.160.101  10.0.160.104  up     up     104     secure     Branch4
dtvi-0/91  tvi-0/4.0  Tenant1   Tenant1-Control-VR   10.0.128.101  10.0.128.102  up     up     102     cleartext  Branch2
dtvi-0/92  tvi-0/5.0  Tenant1   Tenant1-Control-VR   10.0.160.101  10.0.160.102  up     up     102     secure     Branch2
dtvi-0/93  tvi-0/4.0  Tenant1   Tenant1-Control-VR   10.0.128.101  10.0.128.103  up     up     103     cleartext  Branch3
dtvi-0/94  tvi-0/5.0  Tenant1   Tenant1-Control-VR   10.0.160.101  10.0.160.103  up     up     103     secure     Branch3
ptvi2      tvi-0/5.0  Tenant1   Tenant1-Control-VR   10.0.160.101  10.0.160.1    pdown  up     1       secure     Controller-1

admin@Branch1-cli> show route routing-instance Tenant1-LAN-VR 192.168.3.0/24
Routes for Routing instance : Tenant1-LAN-VR  AFI: ipv4  SAFI: unicast
[+] - Active Route
Routing entry for 192.168.3.0 (mask 255.255.255.0) [+]
Known via 'BGP', distance 200,
  Redistributing via BGP
  Last update from 10.0.160.102 00:05:57 ago
Routing Descriptor Blocks:
* 10.0.160.102 , via Indirect 00:05:57 ago

The data plane is not affected either: communication between a Branch1 host and a Branch2 host continues to work despite the controller (control plane) failure:

gns3@Linux1:~$ ip a sh ens4
3: ens4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
   link/ether 0c:9f:26:0c:15:01 brd ff:ff:ff:ff:ff:ff
   inet 192.168.2.2/24 brd 192.168.2.255 scope global noprefixroute ens4
      valid_lft forever preferred_lft forever
   inet6 fe80::799e:7b8c:1f99:62d0/64 scope link noprefixroute
      valid_lft forever preferred_lft forever
gns3@Linux1:~$ ping -c 1 192.168.3.2
PING 192.168.3.2 (192.168.3.2) 56(84) bytes of data.
64 bytes from 192.168.3.2: icmp_seq=1 ttl=62 time=4.67 ms
--- 192.168.3.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 4.675/4.675/4.675/0.000 ms

The routes are kept as stale in the RIB and FIB for Stale Path Time × multiplier. With the default parameter values, the data plane of a Versa Secure SD-WAN network would therefore survive for 8 hours (3600 s × 8 = 28800 s) after a complete control plane failure.
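
The difference between the two scenarios can be summarised in a small toy model. It is a deliberately simplified sketch and not how Versa OS is implemented: the `BranchRib` class is hypothetical, the defaults mirror the stalepath-time and multiplier values from the configuration shown earlier, and session re-establishment is left out.

```python
from typing import Dict, Optional


class BranchRib:
    """Toy model of a branch's reaction to losing its controller session."""

    def __init__(self, graceful_restart: bool,
                 stale_path_time: int = 3600, multiplier: int = 8) -> None:
        self.graceful_restart = graceful_restart
        self.hold_time = stale_path_time * multiplier  # seconds routes stay stale
        self.routes: Dict[str, str] = {}               # prefix -> "active" | "stale"
        self.lost_at: Optional[int] = None

    def learn(self, prefix: str) -> None:
        self.routes[prefix] = "active"

    def controller_lost(self, now: int) -> None:
        if self.graceful_restart:
            # Keep forwarding, but remember that the routes are stale.
            self.lost_at = now
            self.routes = {prefix: "stale" for prefix in self.routes}
        else:
            # Without GR the routes are removed immediately.
            self.routes.clear()

    def tick(self, now: int) -> None:
        """Flush stale routes once the hold time has elapsed."""
        if self.lost_at is not None and now - self.lost_at >= self.hold_time:
            self.routes.clear()
            self.lost_at = None


with_gr = BranchRib(graceful_restart=True)
without_gr = BranchRib(graceful_restart=False)
for rib in (with_gr, without_gr):
    rib.learn("192.168.3.0/24")
    rib.controller_lost(now=0)

print(without_gr.routes)   # empty: route removed at once
print(with_gr.routes)      # route kept, marked stale
with_gr.tick(now=28799)
print(with_gr.routes)      # still stale, one second before expiry
with_gr.tick(now=28800)
print(with_gr.routes)      # empty: flushed after 3600 s x 8
```

The model makes the trade-off visible: with GR the branch keeps forwarding on possibly outdated information for up to the hold time, while without GR it loses reachability as soon as the controller disappears.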

To check whether the BGP GR capability is in use and how much time remains until the Stale Path Time expires (at which point the routes are flushed from the RIB/FIB), use the CLI command below:

admin@Branch1-cli> show bgp graceful-restart brief Tenant1-Control-VR neighbor-ip 10.0.160.1
Neighbor        GR-Cap   StaleNlri    GR-Time/Stale-Path-Time
10.0.160.1      Recvd    --          28491
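
As a quick sanity check of the arithmetic, the following sketch parses the two lines of output above and converts the timer into a human-readable duration. It assumes the four-column layout exactly as printed here and assumes the timer counts down from Stale Path Time × multiplier (28800 s with the defaults). The helper name `parse_gr_brief` is hypothetical.

```python
from datetime import timedelta

SAMPLE = """\
Neighbor        GR-Cap   StaleNlri    GR-Time/Stale-Path-Time
10.0.160.1      Recvd    --          28491
"""

HOLD_TIME = 3600 * 8  # default stalepath-time x multiplier, in seconds


def parse_gr_brief(text: str) -> list:
    """Parse the brief output into one dict per neighbor (header line skipped)."""
    rows = []
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) != 4:
            continue  # ignore blank or unexpected lines
        neighbor, gr_cap, stale_nlri, timer = fields
        rows.append({
            "neighbor": neighbor,
            "gr_cap": gr_cap,
            "stale_nlri": stale_nlri,
            "timer": int(timer),
        })
    return rows


row = parse_gr_brief(SAMPLE)[0]
remaining = timedelta(seconds=row["timer"])
elapsed = HOLD_TIME - row["timer"]
print(f"{row['neighbor']}: {remaining} remaining, {elapsed} s since the timer started")
```

For the output above this gives 7:54:51 remaining, i.e. roughly five minutes (309 s) after the controller was lost, under the countdown assumption stated above.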

For more information on the BGP Graceful Restart capability and the tuning of its parameters, please see Versa Networks’ official documentation.