L2 over SD-WAN with EVPN
by Amr Masoud
Introduction
SDWAN technology was introduced to provide abstracted, simplified, and optimized WAN connectivity over any type of transport. However, traditional WAN transport technologies like MPLS have been developed over decades to provide reliable connectivity and to serve special customer use cases. To position itself as the modern unified WAN solution, SDWAN must support all traditional WAN use cases in addition to its advanced traffic management capabilities.
One of those important traditional WAN use cases is L2 connectivity between sites. Legacy FR & ATM WAN transports provided L2 natively. MPLS transport (L3 by nature) was later extended to provide robust and scalable L2 connectivity with VPLS and EVPN services.
Being an L3 overlay routing technique, SDWAN must also be able to provide L2 connectivity between sites in a scalable way. For vendors who implement SDWAN as a matter of dynamic IPsec tunnels without advanced routing techniques, that is an extremely challenging use case.
However, with its sophisticated routing architecture, Versa can deliver SDWAN-based overlay L2 connectivity leveraging EVPN and VXLAN (in the overlay) to provide easy and scalable L2 & L3 connectivity over SDWAN with all traffic steering/management options, genuine multi-tenancy, and ultimate security.
The need for L2 Ethernet connectivity between sites
Since it was developed at Xerox in 1973 and standardized as IEEE 802.3 in 1983, Ethernet has been the de facto standard for L2 connectivity in packet-based networking due to its simplicity of deployment, providing plug-and-play L2 connectivity between hosts.
Due to Ethernet's scalability limitations, many applications were developed to communicate at the L3 (IP) level. However, many networking applications still require L2 Ethernet switching for communication. Those applications, and the need for L2 reachability to the L3 GW, have maintained the need for L2 Ethernet switching.
For WAN & SDWAN connectivity between sites, L2 reachability is also a requirement for some important use cases, such as:
- Active-Active Data Centers
- DC Migration
- Virtual Machine mobility and availability between DCs
- Legacy applications
- DB clustering
Figure 1: Example for L2/SDWAN Use Cases
L2 Ethernet switching Operations & Limitations
To keep it simple (plug-and-play), and unlike L3 routing, L2 Ethernet switching does not require a protocol to proactively learn MAC addresses before switches can forward L2 frames. Rather, L2 Ethernet learns MAC addresses in the data plane while forwarding, relying on four simple operations that depend on the L2 traffic type:
# | Traffic Type | Description | Result of | Operation |
---|---|---|---|---|
1 | Known Unicast | Destination MAC address was previously learned | One-to-one (unicast) communication | Learn the source MAC and send to the specific port where the destination MAC was previously learned |
2 | Unknown Unicast | Destination MAC address is not known yet | One-to-one (unicast) communication | Learn the source MAC and flood to all other ports |
3 | Broadcast | Destination MAC is: • FF:FF:FF:FF:FF:FF | Host ARP, and some network applications like DHCP | Learn the source MAC and flood to all other ports |
4 | Multicast | Destination MAC starts with: • 01:80:C2 • 01:1B:19 • 01:00:5E • 01:0C:CD • 01:00:0C | Protocols that require L2 multicast communication, like LLDP, STP, OSPF, VRRP, and IGMP, or mapping an L3 multicast address to an L2 multicast address to optimize L2 multicast communication | Learn the source MAC and flood to all other ports |
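
The four operations in the table can be sketched as a tiny learning switch. This is a minimal illustration, not any vendor's implementation: the class name `LearningSwitch`, the port numbers, and the MAC addresses are all hypothetical, and VLANs and MAC aging are ignored.

```python
from dataclasses import dataclass, field


def is_group_address(mac: str) -> bool:
    # The least-significant bit of the first octet marks a group address,
    # which covers both multicast and the all-ones broadcast address.
    return int(mac.split(":")[0], 16) & 1 == 1


@dataclass
class LearningSwitch:
    ports: set[int]
    mac_table: dict[str, int] = field(default_factory=dict)

    def receive(self, in_port: int, src: str, dst: str) -> set[int]:
        """Return the set of ports the frame is sent out of."""
        src, dst = src.lower(), dst.lower()
        # Data-plane learning: remember the port the source MAC was seen on.
        self.mac_table[src] = in_port
        # Broadcast, unknown unicast, and multicast (BUM) frames are flooded.
        if is_group_address(dst) or dst not in self.mac_table:
            return self.ports - {in_port}
        # Known unicast: forward only to the learned port
        # (nothing is sent if that is the ingress port itself).
        return {self.mac_table[dst]} - {in_port}


sw = LearningSwitch(ports={1, 2, 3})
first = sw.receive(1, "00:00:00:00:00:0a", "00:00:00:00:00:0b")  # unknown unicast
reply = sw.receive(2, "00:00:00:00:00:0b", "00:00:00:00:00:0a")  # known unicast
print(first, reply)
```

The first frame is flooded because host `0b` has not been seen yet; the reply is forwarded to a single port because host `0a` was learned from the first frame.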
This simplicity of operation comes with many limitations for L2 Ethernet switching, caused by BUM (Broadcast, Unknown unicast, Multicast) traffic:
- A high potential for broadcast storms/loops when redundant paths are available, which can bring the whole network down in seconds
- Unnecessary flood traffic that over-utilizes the links and consumes host network processing
- Limited scalability due to the above
- The need for the complicated STP protocol to block redundant paths
- No utilization of redundant paths, because STP blocks them
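The effect of redundant paths on flooding can be illustrated with a small simulation. It is a sketch under stated assumptions: four hypothetical switches named A to D, no STP, one link per switch pair, and the fact that Ethernet frames carry no TTL, so nothing ever removes a looping copy.

```python
def frames_in_flight(links: dict[str, list[str]], origin: str, hops: int) -> int:
    """Count copies of one broadcast frame after `hops` switch-to-switch hops."""
    # Each entry is (switch currently holding a copy, neighbor it arrived from).
    frames = [(origin, None)]
    for _ in range(hops):
        # Flooding: every copy is re-sent to all neighbors except the ingress one.
        frames = [
            (neighbor, switch)
            for switch, came_from in frames
            for neighbor in links[switch]
            if neighbor != came_from
        ]
    return len(frames)


# Hypothetical topologies: a full mesh (redundant paths, no STP) and a loop-free star.
full_mesh = {s: [t for t in "ABCD" if t != s] for s in "ABCD"}
star = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

for hops in (1, 2, 5, 10):
    print(hops, frames_in_flight(full_mesh, "A", hops), frames_in_flight(star, "A", hops))
```

In the full mesh the number of copies doubles on every hop, while in the loop-free star the broadcast dies out after one hop. That is the trade-off STP makes: it prevents the storm by reducing the topology to a tree, at the cost of leaving the redundant links unused.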
Figure 2: L2 Ethernet Switching and the Potential of Broadcast Storms
Extending L2 Over L3 network
When L2 connectivity is required (as mentioned earlier) between sites that are separated by an L3 network (as is the case with SDWAN), how can L2 be provisioned?
Service providers have their own ways (like L2VPN-VPLS or EVPN-MPLS) to provide L2 connectivity for enterprise customers. However, that makes the customer dependent on high-cost SP MPLS connectivity, which contradicts SDWAN’s promised advantages. Therefore, we need a way to provide L2 connectivity over SDWAN without any dependency on the SP.
Figure 3: VM Mobility requires L2 reachability over the WAN for VM to reach its L3 Gateway
To provide L2 over SDWAN, two main things are required:
1- A tunneling/encapsulation mechanism to transport L2 frames over an L3 network
  - Older protocols like L2TP can be used. However, VXLAN over UDP was developed to handle L2 Ethernet encapsulation with great flexibility using the VXLAN header, mainly the VNI (VXLAN Network Identifier, also written VNID) and flags.
2- An L2 MAC address learning mechanism to emulate L2 learning over an L3 network. Two options can be used for this:
  a. Data-plane learning & BUM flooding. L2 BUM flooding can be emulated over L3 with either ingress replication or L3 multicast. However, data-plane learning over L3 still inherits the scalability limitations of BUM forwarding in L2 networks, which makes it a poor solution.
  b. Control-plane learning for proactive L2 MAC learning over an L3 network. EVPN is the mature, flexible, standards-based protocol designed to achieve that.
VXLAN Tunneling
Figure 4: VXLAN/UDP Tunneling structure
As shown above, VXLAN encapsulation is performed by VTEP (VXLAN Tunnel Endpoint) capable routers/switches at the edge of each site. A VTEP adds the following headers when it encapsulates the original L2 frame:
- VXLAN header – includes the VXLAN Network Identifier (VNI), which acts mainly as the L2 domain (BD) identifier. The VNI is 24 bits, providing about 16.7 million possible VNIs compared to only 4,096 VLAN IDs with 802.1Q tagging.
- Outer UDP header – provides transport-layer addressing between VTEPs
- Outer IP header – provides IP connectivity between VTEPs
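The VXLAN header itself is only 8 bytes, and its layout can be sketched with Python's `struct` module. The layout below follows RFC 7348 (an 8-bit flags field with the "I" bit set, 24 reserved bits, the 24-bit VNI, and 8 reserved bits). The function names and the VNI value 10100 are illustrative assumptions, and the outer UDP and IP headers are left to the sending socket.

```python
import struct

VXLAN_UDP_PORT = 4789   # IANA-assigned destination UDP port for VXLAN
VNI_VALID_FLAG = 0x08   # "I" flag: the VNI field carries a valid value


def pack_vxlan_header(vni: int) -> bytes:
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + 24 reserved bits.
    # Word 2: VNI (24 bits) + 8 reserved bits.
    return struct.pack("!II", VNI_VALID_FLAG << 24, vni << 8)


def unpack_vni(packet: bytes) -> int:
    word1, word2 = struct.unpack("!II", packet[:8])
    if not (word1 >> 24) & VNI_VALID_FLAG:
        raise ValueError("VNI flag is not set")
    return word2 >> 8


def encapsulate(vni: int, inner_frame: bytes) -> bytes:
    # The result becomes the UDP payload sent to the remote VTEP.
    return pack_vxlan_header(vni) + inner_frame


header = pack_vxlan_header(10100)   # hypothetical VNI
print(header.hex())                 # 0800000000277400
print(2**24, "possible VNIs vs", 2**12, "VLAN IDs")
```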
EVPN Control-Plane Routing
Figure 5: EVPN (Ethernet VPN) as an extension to MP-BGP (Multi-Protocol BGP)
EVPN is a set of NLRI (Network Layer Reachability Information), or an address family, that runs over MP-BGP to advertise both L2 (MAC) and L3 (IP) reachability over L3 networks. EVPN/MP-BGP runs between VTEPs. The resulting routing updates are used in the data plane for VXLAN VTEP encapsulation.
Versa Implementation for EVPN-VXLAN for L2/SDWAN
With its sophisticated SDWAN overlay routing architecture, which relies by nature on MP-BGP & VXLAN, Versa seamlessly implements the EVPN address family (NLRI) with MP-BGP for the control plane and VXLAN tunneling for the data plane to provide both L2 & L3 overlays with great flexibility & scalability.
Figure 6: Versa Implementation for L2-SDWAN with EVPN-VXLAN
Architecture Explained
The Versa SDWAN solution does not require a new architecture to run EVPN for L2 connectivity. It is just a matter of enabling the EVPN NLRI on the existing SDWAN overlay architecture, which relies on MP-BGP by nature.
- VOS (Versa OS), which runs at the branch/site as the SDWAN CPE, acts as the VTEP (VXLAN Tunnel Endpoint) switch/router over any existing underlay transport connectivity
- The VOS CPE at each site peers with central Controllers acting as Route Reflectors (RRs) using MP-BGP over a secure tunnel per tenant. When an L2 LAN interface is added to a site, the EVPN address family (NLRI) is automatically activated at that site and runs with the Controller (as RR)
  - The Controller is part of the Versa Headend, which consists of Director for management, Controller for routing, and Analytics for analysis
  - The Versa Headend can be deployed on-prem or as SaaS by Versa
  - The Controller runs VOS as well
  - For redundancy, multiple RRs/Controllers can be implemented
- Based on the exchanged EVPN routing info, VOS at a site runs VXLAN/UDP encapsulation (cleartext tunnel) or VXLAN/IPsec (secure tunnel) with other sites to carry L2 frames over SDWAN
EVPN Implementation with Versa
EVPN Routes
Versa’s VOS software supports the following EVPN route types, as specified in RFC 7432 (RT1 to RT4) and RFC 9136 (RT5):
Route Type | Name | Used For |
---|---|---|
Route Type 2 (RT2) | MAC/IP Advertisement Route | • Advertises MACs and/or MAC-IPs • Enables L2VNI bridging • Enables L2VNI routing • Reduces L2 BUM traffic |
Route Type 5 (RT5) | IP Prefix Route | • Advertises routes to subnet prefixes • Provides external connectivity • Helps overcome the limitation of “silent” hosts |
Route Type 3 (RT3) | Inclusive Multicast Ethernet Tag Route | • Per-BD peer VTEP discovery • Helps optimize BUM flooding with either ingress replication or a per-BD L3 multicast group |
Route Type 1 (RT1) | Ethernet Auto-Discovery (A-D) Route | • Allows for standards-based Ethernet multihoming and link load-sharing |
Route Type 4 (RT4) | Ethernet Segment Route | • Allows for standards-based Ethernet multihoming and link load-sharing |
Outcomes of EVPN Routes
- Proactive MAC-IP learning between VTEPs, eliminating the need to flood most unknown unicast (U) traffic. Achieved with Route Type 2 (RT2)
- ARP suppression by the VTEP, which keeps ARP floods behind the VTEP and turns ARP into unicast between VTEPs. ARP forms most of the L2 broadcast (B) traffic. Achieved with RT2
- IP prefix learning between VTEPs. Achieved with RT5
- Identification of the members of each BD, so the remaining BUM traffic is sent only to the relevant VTEP receivers (either with ingress replication or with a per-BD L3 multicast group). Achieved with RT3
- Auto-discovery of branches that belong to the same BD (Bridge Domain). Achieved with RT1
- Host dual-homing to two VTEPs for HA & link load-sharing with anycast gateway. Achieved with RT1 & RT4
Route Type-2 advertisement
The diagram below shows how RT2 is exchanged between VOS at the sites via the Route Reflector (Versa Controller).
Figure 7: EVPN RT2 Explained
Sequence:
- VM11 sends any traffic that reaches VOS-1 (VOS acting as VTEP)
- VOS-1 builds a local entry for VM11’s IP-MAC with itself as the next hop
- VOS-1 sends an RT2 EVPN route to the Controller (Route Reflector)
- The Controller reflects the RT2 received from VOS-1 to all other VTEPs (VOS-2 & VOS-3 in this example)
- Both VOS-2 & VOS-3 install that route in their routing tables with VOS-1 as the next hop
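The sequence above can be modeled in a few lines. This is a simplified sketch, not Versa's implementation or a real BGP speaker: `MacIpRoute` stands in for an RT2 with only the fields needed here, and the VNI, MAC, and IP values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MacIpRoute:
    """Simplified stand-in for an EVPN Route Type 2 (MAC/IP Advertisement)."""
    vni: int
    mac: str
    ip: str
    next_hop: str  # name of the VTEP that originated the route


@dataclass
class Vtep:
    name: str
    local_hosts: dict = field(default_factory=dict)    # (vni, mac) -> ip
    remote_routes: dict = field(default_factory=dict)  # (vni, mac) -> MacIpRoute

    def learn_local(self, vni: int, mac: str, ip: str) -> MacIpRoute:
        # Steps 1-2: a local host is seen; build an entry with this VTEP as next hop.
        self.local_hosts[(vni, mac)] = ip
        return MacIpRoute(vni, mac, ip, next_hop=self.name)

    def install(self, route: MacIpRoute) -> None:
        # Step 5: remote VTEPs install the route pointing at the originator.
        self.remote_routes[(route.vni, route.mac)] = route


@dataclass
class RouteReflector:
    clients: list = field(default_factory=list)

    def reflect(self, route: MacIpRoute, sender: Vtep) -> None:
        # Step 4: reflect the route to every client except the one that sent it.
        for client in self.clients:
            if client is not sender:
                client.install(route)


vos1, vos2, vos3 = Vtep("VOS-1"), Vtep("VOS-2"), Vtep("VOS-3")
controller = RouteReflector(clients=[vos1, vos2, vos3])

vm11 = vos1.learn_local(vni=100, mac="00:50:56:00:00:11", ip="10.0.0.11")
controller.reflect(vm11, sender=vos1)  # step 3: VOS-1 advertises to the Controller
print(vos2.remote_routes[(100, "00:50:56:00:00:11")].next_hop)
```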
ARP suppression
Because the hosts' MAC-IP bindings are learned proactively via RT2, a local ARP request can be answered by the local VTEP without flooding it to the remote VTEPs.
Figure 8: ARP Suppression as a result of RT2
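A minimal sketch of the lookup behind ARP suppression, assuming the VTEP keeps a per-VNI table of IP-to-MAC bindings fed by RT2 routes. The class and method names are hypothetical; a real VTEP would also build the ARP reply frame and age out entries.

```python
from typing import Optional


class ArpSuppressionCache:
    """Per-VTEP cache of IP-to-MAC bindings learned from RT2 routes."""

    def __init__(self) -> None:
        self._bindings: dict[tuple[int, str], str] = {}

    def on_rt2(self, vni: int, mac: str, ip: str) -> None:
        # A received RT2 carries both the MAC and the IP of a remote host.
        self._bindings[(vni, ip)] = mac

    def on_rt2_withdraw(self, vni: int, ip: str) -> None:
        self._bindings.pop((vni, ip), None)

    def handle_arp_request(self, vni: int, target_ip: str) -> Optional[str]:
        """Return the MAC to answer with locally, or None if the request
        must still be flooded (for example, for a silent host)."""
        return self._bindings.get((vni, target_ip))


cache = ArpSuppressionCache()
cache.on_rt2(vni=100, mac="00:50:56:00:00:22", ip="10.0.0.22")
print(cache.handle_arp_request(100, "10.0.0.22"))  # known host: answered locally
print(cache.handle_arp_request(100, "10.0.0.99"))  # unknown host: falls back to flooding
```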
Unicast Forwarding for unknown unicast
Because the hosts' MAC-IP bindings are learned proactively via RT2, a frame that would otherwise be an unknown unicast is encapsulated and sent as L3 unicast from the source VTEP to the destination VTEP.
Figure 9: Unicast Forwarding for Unknown Unicast because of proactive learning via RT2
Remaining BUM traffic handling
As seen above, the majority of BUM traffic, represented by unknown unicast and broadcast ARP, is now handled with L3 unicast & ARP suppression as an outcome of RT2 proactive learning.
Note: The remaining unknown unicast L2 traffic is a result of silent hosts, which is a rare corner case.
The remaining BUM traffic still needs to be flooded to all remote VTEPs. This can be achieved with one of two methods:
a. Ingress replication from the source VTEP to all other VTEPs that belong to the same L2-VNI (Bridge Domain). Versa implements this method. EVPN RT3 helps identify all VTEPs that belong to the same BD.
b. Layer 3 multicast from the source VTEP to all other VTEPs that belong to the same L2-VNI (BD). Each BD is assigned a unique L3 multicast group address. The intermediate L3 network must run IP multicast routing in this case. EVPN RT3 helps identify all VTEPs that belong to the same BD.
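Method (a) can be sketched as a per-VNI flood list built from RT3 routes, with one unicast copy of each BUM frame sent to every listed peer. The class name and the VTEP IP addresses are hypothetical, and the VXLAN encapsulation of each copy is omitted for brevity.

```python
from collections import defaultdict


class IngressReplicator:
    """Builds per-VNI flood lists from RT3 routes and replicates BUM frames."""

    def __init__(self, local_vtep_ip: str) -> None:
        self.local_vtep_ip = local_vtep_ip
        self.flood_list: dict[int, set[str]] = defaultdict(set)

    def on_rt3(self, vni: int, originator_ip: str) -> None:
        # Each RT3 announces that a VTEP participates in this BD (VNI).
        if originator_ip != self.local_vtep_ip:
            self.flood_list[vni].add(originator_ip)

    def on_rt3_withdraw(self, vni: int, originator_ip: str) -> None:
        self.flood_list[vni].discard(originator_ip)

    def replicate(self, vni: int, frame: bytes) -> list[tuple[str, bytes]]:
        # One unicast copy per remote VTEP in the same BD; the L3 network
        # in between needs no multicast support.
        return [(peer, frame) for peer in sorted(self.flood_list.get(vni, ()))]


vtep = IngressReplicator(local_vtep_ip="10.1.1.1")
vtep.on_rt3(100, "10.1.1.2")
vtep.on_rt3(100, "10.1.1.3")
vtep.on_rt3(200, "10.1.1.4")
copies = vtep.replicate(100, b"broadcast-frame")
print([peer for peer, _ in copies])
```

The cost of this choice is that the source VTEP sends one copy per peer, so its uplink load grows with the number of VTEPs in the BD. Method (b) moves that replication into the network but requires IP multicast routing in the intermediate L3 network.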
Inner label with EVPN
For L2/SDWAN (VOS to VOS), inner labels are advertised with the RT2 & RT5 routes (BD-ID & VRF-ID with RT2, and VRF-ID with RT5). Those inner labels can be MPLS labels (with an EVPN-MPLS implementation) or VXLAN VNIs (with an EVPN-VXLAN implementation). The Versa VTEP, represented by VOS, can implement either option (as a configuration choice). That applies to the inner labels. For the outer encapsulation between VTEPs, however, VOS always uses VXLAN/UDP encapsulation with the VTEP IP addresses, not MPLS.
For peering between VOS and another device (usually toward the LAN side), Versa uses standard EVPN routing & VXLAN encapsulation in compliance with RFC 8365.
EVPN-VXLAN Control-Plane vs regular Ethernet Data-Plane learning & forwarding
# | Traffic Type | Normal Ethernet Operation | EVPN-VXLAN Operation |
---|---|---|---|
1 | Known Unicast | Learn the source MAC and send to the specific port where the destination MAC was previously learned | Learning: each VTEP learns its local hosts and proactively advertises their MAC addresses to the other VTEPs via RT2. Forwarding: unicast VTEP encapsulation |
2 | Unknown Unicast | Learn the source MAC and flood to all other ports | Learning: each VTEP learns its local hosts and proactively advertises their MAC addresses to the other VTEPs via RT2. Forwarding: - For learned hosts, unicast VTEP encapsulation - For silent hosts, traffic is sent to all VTEPs that belong to the same BD with either ingress replication or L3 multicast |
3 | Broadcast | Learn the source MAC and flood to all other ports | Learning: each VTEP learns its local hosts and proactively advertises their MAC addresses to the other VTEPs via RT2. BD VTEP members are learned via RT3. Forwarding: - ARP is converted into unicast and sent with unicast VTEP encapsulation - The rest of the broadcast traffic is sent to all VTEPs that belong to the same BD with either ingress replication or L3 multicast |
4 | Multicast | Learn the source MAC and flood to all other ports | Learning: BD VTEP members are learned via RT3. Forwarding: multicast traffic is sent to all VTEPs that belong to the same BD with either ingress replication or L3 multicast |
SDWAN with EVPN-VXLAN advantages
Using EVPN & VXLAN for L2 over SDWAN, the Versa solution provides the following advantages:
- Carries L2 & L3 traffic concurrently (allowing for workload mobility) using the same overlay routing protocol with a simple & unified configuration
- Proactive MAC learning in the control plane
- Optimizes L2 forwarding by:
  - Converting most of the flood traffic caused by unknown unicast into direct unicast forwarding between VTEPs
  - Suppressing the majority of broadcast traffic caused by ARP behind the VTEP and converting it into direct unicast forwarding between VTEPs
  - Optimizing the flooding of the remaining BUM traffic with either ingress replication or L3 multicast routing
- Eliminates the potential for L2 loops/storms
- Allows for multi-tenancy for L2 & L3 traffic
- Eliminates the need for STP & its complexity
- Utilizes all available links with ECMP for L2 traffic
- Scales to a massive number of L2 hosts, without the BUM-related limitations of traditional Ethernet switching
- Scales to a massive number of Bridge Domains (BDs) thanks to the 24-bit VXLAN VNI address space
- Standards-based, allowing interoperability with other EVPN-VXLAN networks
- Allows for standard dual-homing and anycast gateway for HA and link load-balancing
- Enables flexible topologies by means of Route Targets (RTs)
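As an illustration of the last point, the generic BGP VPN rule is that a site imports a route when at least one of the route's export RTs matches one of the site's import RTs. The sketch below uses that rule to build a hub-and-spoke topology. The RT values and site names are hypothetical, and this is not Versa configuration syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    export_rts: frozenset  # RTs attached to routes this site advertises
    import_rts: frozenset  # RTs this site accepts


def accepts(receiver: Site, sender: Site) -> bool:
    # A route is imported when at least one of its RTs matches an import RT.
    return bool(sender.export_rts & receiver.import_rts)


HUB_RT, SPOKE_RT = "target:65000:1", "target:65000:2"  # hypothetical values

hub = Site("hub", export_rts=frozenset({HUB_RT}), import_rts=frozenset({SPOKE_RT}))
spoke_a = Site("spoke-a", export_rts=frozenset({SPOKE_RT}), import_rts=frozenset({HUB_RT}))
spoke_b = Site("spoke-b", export_rts=frozenset({SPOKE_RT}), import_rts=frozenset({HUB_RT}))

for receiver, sender in [(hub, spoke_a), (spoke_a, hub), (spoke_a, spoke_b)]:
    print(f"{receiver.name} imports from {sender.name}: {accepts(receiver, sender)}")
```

Spokes exchange routes with the hub but not with each other. Giving every site the same import and export RT would produce a full mesh instead.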
On the other hand, implementing EVPN over SDWAN rather than as traditional EVPN brings further advantages, for example:
- Centralized provisioning (with ZTP) & management for all sites providing L2 connectivity via Versa Director or Concerto
- Allows sending L2 traffic with advanced SLA-based and App-based traffic steering options
- Allows building advanced topologies for L2 communication, like full-mesh, hub-and-spoke, etc.
- Applying advanced security controls to L2 traffic between sites
- L2 traffic live monitoring & advanced analytics with Versa Analytics
Versa Inter-AS EVPN-VXLAN/SDWAN for DC Fabric expansion
DC fabric expansion can be another use case for Versa EVPN-VXLAN/SDWAN, as shown below. In this use case, VOS at each site peers with the local EVPN-VXLAN DC fabric on the LAN side using standard EVPN-VXLAN (in compliance with RFC 8365), providing Inter-AS EVPN-VXLAN fabric expansion over SD-WAN.
This use case can be used to extend a DC fabric across multiple sites over SDWAN, providing concurrent L2 & L3 connectivity between all DC fabrics in a standard implementation.