SD-WAN transport-side BGP

Cisco SD-WAN transport-side BGP and DIA NAT

27 September 2022   16 min read

The majority of Cisco SD-WAN guides and posts I have found use static routing rather than routing protocols on the transport-side. Static routes are all very well for SD-WAN tunnel traffic, but I wanted to understand how you account for DIA traffic in a more real-life situation where address ranges are advertised via BGP.

The Cisco SD-WAN BGP documentation puts a lot of emphasis on using a loopback interface rather than the physical interface for the transport-side (VPN0) BGP peering. This is not the only option, and I assume it is meant more for when you have multiple links and want to route through an issue (like with iBGP) rather than around it (routing convergence).

The configuration for this network can be found here, in summary:

  • The SD-WAN vManage, vBond and vSmart are at a site external to the cEdge routers (CSR 1000v)
  • Each site has transport-side eBGP peers to ISPs and service-side iBGP MPLS-VPN peerings to core switches
  • A default route is advertised by the ISPs (removed later) and a /28 address range advertised out to the ISPs
  • Both sites have a Gold TLOC to one ISP with site 11 also having a Silver TLOC
  • Site 21 has a TLOC extension between routers to allow the ISP connections (TLOCs) to be shared


Transport-side (VPN0) ISP BGP peering

When a physical interface is made a tunnel-interface it becomes a hardened SD-WAN interface that will only allow certain traffic in/out such as tunnel traffic (GRE or IPsec) or native traffic (DNS, NETCONF, SSHD, etc) destined to the router itself. It is possible to allow additional services such as OSPF or BGP over the tunnel interface which is required for routing peerings to be able to form out of the physical interfaces associated to that tunnel. The configuration of the allowed services is done under the VPN0 feature template.

interface Tunnel1
 ip unnumbered GigabitEthernet1
 tunnel source GigabitEthernet1
 tunnel mode sdwan
!
interface GigabitEthernet1
 description UPLINK - ISP A
 ip address 192.168.10.2 255.255.255.252
!
sdwan
 interface GigabitEthernet1
  tunnel-interface
   encapsulation ipsec weight 1
   color gold
   allow-service bgp
   << rest of config omitted for brevity >>

The BGP configuration for this is pretty much the same as you would expect on a standard CSR setup. One nuance with SD-WAN in comparison to IOS-XE is that the BGP configuration cannot reference a route-map that does not already exist, otherwise the template push will fail. The route-map must first be created as a route-policy within a localised policy and attached as an Additional Template within the device template (alternatively use a CLI feature template) before it can be referenced by BGP.

policy
 route-policy RM_ISP_IN
  sequence 1
   match
    address PL_DEFAULT
   action accept
  default-action reject
 !
 lists
  prefix-list PL_DEFAULT
   ip-prefix 0.0.0.0/0 

The resulting configuration pushed to the cEdge device is the same as it would be with IOS-XE.

ip prefix-list PL_DEFAULT seq 5 permit 0.0.0.0/0
!
route-map RM_ISP_IN permit 1
 match ip address prefix-list PL_DEFAULT
route-map RM_ISP_IN deny 65535
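
To make the filtering behaviour concrete, here is a minimal Python sketch of how I understand RM_ISP_IN to evaluate a received prefix. The names mirror the config above, but the function itself is hypothetical and the evaluation logic is a simplified assumption (exact prefix match only, as no ge/le range is defined).

```python
import ipaddress

# Hypothetical model of the policy above; not generated by vManage.
PL_DEFAULT = [ipaddress.ip_network("0.0.0.0/0")]

def rm_isp_in(prefix: str) -> bool:
    """Return True if the route-map would accept the received prefix."""
    network = ipaddress.ip_network(prefix)
    # Sequence 1: exact match against the prefix-list entries
    if network in PL_DEFAULT:
        return True
    # default-action reject (rendered as 'route-map RM_ISP_IN deny 65535')
    return False

received = ["0.0.0.0/0", "8.8.8.0/24", "0.0.0.0/1"]
accepted = [p for p in received if rm_isp_in(p)]
print(accepted)
```

Only the default route survives, so anything else the ISP sends is dropped by the implicit deny.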

A BGP feature template defines the global BGP configuration with the IPv4 address family neighbor holding the peering information. A null route is required under the VPN0 template to allow for BGP to advertise the public range to the ISP.

The BGP template is attached to the VPN0 transport section of the device template along with the VPN0 and interface feature templates.

ip route 0.0.0.0 0.0.0.0 192.168.10.1
!
router bgp 65111
 bgp log-neighbor-changes
 timers bgp 9 27
 neighbor 192.168.10.1 remote-as 10
 neighbor 192.168.10.1 description ISP A
 address-family ipv4
  network 10.10.10.0 mask 255.255.255.240
  neighbor 192.168.10.1 activate
  neighbor 192.168.10.1 send-community both
  neighbor 192.168.10.1 route-map RM_ISP_IN in
  distance bgp 20 200 20
 exit-address-family

The global routing table (transport VPN0) can have multiple different default routes, with static routes (AD 1) preferred over eBGP (AD 20) and ECMP used for routes with the same AD and metric. Default routes are only used to form the control connections (DTLS tunnels to vSmart, vBond and vManage) and for DIA. The actual TLOC used for data plane traffic (and possibly DIA) is dependent on OMP and centralised policy.
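
As a rough illustration of that selection logic, the sketch below picks the installed default routes from a set of candidates by lowest AD, then lowest metric, keeping ties as ECMP. It is a simplified model rather than how IOS-XE implements the RIB, and the second static next-hop (192.0.2.1) is a hypothetical address added only to show the ECMP case.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    prefix: str
    source: str
    distance: int  # administrative distance
    metric: int
    next_hop: str

def best_routes(candidates):
    """Keep the routes with the lowest (distance, metric); ties are installed together (ECMP)."""
    best = min((r.distance, r.metric) for r in candidates)
    return [r for r in candidates if (r.distance, r.metric) == best]

candidates = [
    Route("0.0.0.0/0", "static", 1, 0, "192.168.10.1"),
    Route("0.0.0.0/0", "static", 1, 0, "192.0.2.1"),    # hypothetical second next-hop
    Route("0.0.0.0/0", "ebgp", 20, 0, "192.168.10.1"),  # loses to the statics on AD
]
installed = best_routes(candidates)
print([(r.source, r.next_hop) for r in installed])
```

Both statics are installed and the eBGP learnt default never makes it into the table while they exist.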

NAT DIA

Enabling any of the 3 NAT types within the interface feature template will add ip nat outside to the interface. Additional static NAT statements can be used in combination with any of these types.

interface GigabitEthernet1
 ip nat outside

The configuration generated by SD-WAN transport-side NAT will look familiar to anyone who has used NAT on IOS or IOS-XE, the one difference being the lack of a NAT inside interface. The nat-dia-vpn-hop-access-list and global-list ACLs are not visible in the running config or in the output of show ip access-lists. To edit them in SD-WAN a data policy is used (not sure what the default is, maybe any).

NAT Interface: NATs all Internet bound traffic to the physical interface address

ip nat inside source list nat-dia-vpn-hop-access-list interface GigabitEthernet1 overload

NAT pool: NATs all Internet bound traffic to IPs from a pool of addresses

ip nat pool natpool-GigabitEthernet1-0 20.20.20.14 20.20.20.15 prefix-length 28
ip nat inside source list global-list pool natpool-GigabitEthernet1-0 overload egress-interface GigabitEthernet1

NAT Loopback: NATs all Internet bound traffic to the loopback interface address (loopback does not need NAT configured)

ip nat inside source list global-list interface Loopback99 overload egress-interface GigabitEthernet1

Static NAT: One-to-One NAT, can optionally specify the VRF which the NAT traverses

ip nat inside source static 10.11.11.51 20.20.20.12 egress-interface GigabitEthernet1
ip nat inside source static 10.11.11.52 20.20.20.13 vrf 3001 egress-interface GigabitEthernet1

To make use of the DIA NATs in a service-side VPN (VRF) a static route is required with the gateway being the VPN to route traffic from the service-side VRF to the transport-side (VPN 0).

ip nat route vrf 3001 0.0.0.0 0.0.0.0 global

This results in a NAT DIA default route being added to the VPN's routing table. By default a static route has an AD of 1, NAT DIA 6 and OMP 251, meaning that the NAT DIA route is preferred over any OMP advertised default route, so the local Internet breakout wins over a remotely learned route.

# show ip route vrf 3001
       n - NAT, Ni - NAT inside, No - NAT outside, Nd - NAT DIA
       ia - IS-IS inter area, * - candidate default, U - per-user static route

n*Nd  0.0.0.0/0 [6/0], 00:02:53, Null0

The NAT DIA route is removed if the NAT transport interfaces are down or the transport static route is removed. NAT maintains a translation table that tracks outbound sessions from the service-side VPNs to allow the return traffic to be sent back without having to leak service VPN routes into VPN 0. This default route can be advertised to the networks south of the cEdge router by redistributing it into BGP using redistribute nat-route dia and default-information originate.
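
The translation table idea can be sketched with a toy overload (PAT) table in Python. This is a hypothetical model to show why return traffic finds its way back to the right VRF, not the IOS-XE implementation: real entries are also keyed on protocol and destination, and the starting port of 5000 is an arbitrary assumption.

```python
import itertools

class OverloadNat:
    """Toy port-address-translation table (hypothetical, heavily simplified)."""

    def __init__(self, outside_ip, first_port=5000):
        self.outside_ip = outside_ip
        self._ports = itertools.count(first_port)
        self._out = {}   # (vrf, inside_ip, inside_port) -> outside_port
        self._back = {}  # outside_port -> (vrf, inside_ip, inside_port)

    def outbound(self, vrf, inside_ip, inside_port):
        """Translate an outbound session, reusing the entry if it already exists."""
        key = (vrf, inside_ip, inside_port)
        if key not in self._out:
            port = next(self._ports)
            self._out[key] = port
            self._back[port] = key
        return self.outside_ip, self._out[key]

    def inbound(self, outside_port):
        """Map return traffic back to the originating VRF and host, or None if no session."""
        return self._back.get(outside_port)

nat = OverloadNat("192.168.10.2")
print(nat.outbound(3001, "10.11.11.51", 40000))
print(nat.inbound(5000))
print(nat.inbound(5001))
```

The return lookup carries the VRF with it, which is why no service VPN routes need leaking into VPN 0, and unsolicited inbound traffic simply has no entry to match.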

An alternative option to routing is to use a centralised data policy to match on a range of addresses with an action of nat use-vpn 0 to direct the matched data traffic to an Internet exit point on the local router. It is similar to PBR in the sense that it will blindly pass traffic out of this interface, so it needs to be used in conjunction with DIA tracking and NAT fallback.

TLOC extension

The above BGP configuration worked fine until I wanted to introduce a TLOC extension into the equation. The use case for the TLOC extension is to have different ISPs on 2 different routers and allow either router to use the other router's connection as if it were a directly attached tunnel interface with its own TLOC color.

The TLOC extension logic is pretty straightforward: a link between routers with one end being a tunnel interface and the other end linking that interface to the actual ISP transport interface on the neighbor router (tloc-extension x). To allow for the reverse traffic and future TLOC extensions it is better to use sub-interfaces. The sub-interface MTU must be 4 bytes less than the parent interface or the template push will fail.

As with the ISP tunnel interface, a default route is required in the outbound direction so that the DTLS tunnels can be formed over this new tunnel interface with all controllers. The next-hop will be the IP address of the TLOC extended end of the link (the end that has the tloc-extension command).

Applying the static default routes broke the existing DTLS tunnels, as the static route had a lower AD (1) than the eBGP routes (20), meaning that the BGP route was removed from the routing table. ECMP only works for routes that are sourced by the same routing protocol (static route, BGP, etc) so changing the AD won't work. I could probably have fixed this by running BGP over the TLOC extensions, but this adds too much complexity (it needs per-TLOC extension BGP peers and route filtering), so instead I swapped the BGP learnt ISP default routes for static default routes.

S11-cEdge01

interface GigabitEthernet3
 no ip address
 negotiation auto
interface GigabitEthernet3.11
 description TLOC EXT - Tunnel end
 encapsulation dot1Q 11
 ip address 172.16.11.1 255.255.255.252
 no ip redirects
 ip mtu 1496
interface GigabitEthernet3.12
 description TLOC EXT - Extension end
 encapsulation dot1Q 12
 ip address 172.16.12.2 255.255.255.252
 no ip redirects
 ip mtu 1496
!
interface Tunnel11003
 ip unnumbered GigabitEthernet3.11
 no ip redirects
 tunnel source GigabitEthernet3.11
 tunnel mode sdwan
!
ip route 0.0.0.0 0.0.0.0 172.16.11.2
ip route 0.0.0.0 0.0.0.0 192.168.10.1
!
sdwan
 interface GigabitEthernet3.12
  tloc-extension GigabitEthernet1
 interface GigabitEthernet3.11
  tunnel-interface
   encapsulation ipsec weight 1
   color silver
   no allow-service all
   no allow-service bgp
   << rest of config omitted for brevity >>

S11-cEdge02

interface GigabitEthernet3
 no ip address
 negotiation auto
interface GigabitEthernet3.11
 description TLOC EXT - Extension end
 encapsulation dot1Q 11
 ip address 172.16.11.2 255.255.255.252
 no ip redirects
 ip mtu 1496
interface GigabitEthernet3.12
 description TLOC EXT - Tunnel end
 encapsulation dot1Q 12
 ip address 172.16.12.1 255.255.255.252
 no ip redirects
 ip mtu 1496
!
interface Tunnel12003
 ip unnumbered GigabitEthernet3.12
 no ip redirects
 tunnel source GigabitEthernet3.12
 tunnel mode sdwan
!
ip route 0.0.0.0 0.0.0.0 172.16.12.2
ip route 0.0.0.0 0.0.0.0 192.168.20.1
!
sdwan
 interface GigabitEthernet3.11
  tloc-extension GigabitEthernet1
 interface GigabitEthernet3.12
  tunnel-interface
   encapsulation ipsec weight 1
   no border
   color gold
   no allow-service all
   no allow-service bgp
   << rest of config omitted for brevity >>

Tracking

The use of a static rather than a BGP learnt default route introduces the possibility of blackholing in the event of an ISP failure. This is not a problem for traffic that would normally be sent over the tunnels (inter SD-WAN site traffic) as the BFD sessions over the tunnels would fail, however some sort of tracking is needed for DIA traffic. SD-WAN has 2 native Viptela tracker features that use HTTP port 80 probes expecting a 200 OK response.

  • DIA tracker: Tracks the status of the transport interface it is associated with by periodically probing a specified address (the interface IP is used as the source of the probes) to measure the round-trip-time. When the latency exceeds the configured threshold the tracker considers the network as unavailable, causing the router to withdraw the NAT route from service-side VPNs and re-route traffic based on the local routing table.
  • Static-route tracker (service-side VPN only): Determines if a service-side static route can be included in the routing table of a device based on the availability of the tracked endpoint address. If the tracker does not receive a response to its probes the static route is not included in the routing table and is not advertised to OMP. It is also possible to configure an alternative next-hop address or a static route with a higher administrative distance to provide a backup path.

Newer versions of code support additional DIA tracker features:

  • From 17.7.1a you can configure a tracker group with two trackers (AND or OR logic) and associate this tracker group to an interface
  • Supported Interfaces for NAT DIA Tracker are Cellular, Ethernet, PPPoE, sub-interfaces (17.7.1a) and DSL dialer (17.6.x). Only one tracker or tracker group can be applied to an interface
  • DNS URL endpoint is not supported on Cisco IOS XE SD-WAN devices
  • NAT fallback feature for IOS-XE is supported only from 17.3.2
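
The tracker group AND/OR behaviour from the first bullet can be sketched as below. The function name and the True = up convention are my own assumptions for illustration, it is not how the router exposes it.

```python
def group_status(tracker_states, boolean="or"):
    """Combine the states of the two trackers in a group (True = up)."""
    if len(tracker_states) != 2:
        raise ValueError("a tracker group holds exactly two trackers")
    if boolean == "and":
        # Both endpoints must answer for the group to be up
        return all(tracker_states)
    if boolean == "or":
        # One reachable endpoint is enough to keep the group up
        return any(tracker_states)
    raise ValueError(f"unknown boolean operator: {boolean}")

print(group_status([True, False], "or"))
print(group_status([True, False], "and"))
```

With OR a single failed endpoint does not pull the NAT route, whereas with AND it does, so OR is the more forgiving choice if one of the probed sites is flaky.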

Trackers are configured under the system feature template and then associated under the relevant transport interface feature template. The interface must have NAT enabled or else the deployment will fail.

Parameter   Range       Default  Description
Threshold   100 - 1000  300      Duration in ms to wait for the probe to return a response before declaring the transport interface as down
Interval    20 - 600    60       Frequency in seconds at which a probe is sent to determine the status of the transport interface
Multiplier  1 - 10      3        Number of times a probe can be resent before declaring that the transport interface is down

endpoint-tracker umbrella
 endpoint-ip 208.67.222.222
 tracker-type interface
 threshold 200
 multiplier 3
 interval 20
!
interface GigabitEthernet1
 endpoint-tracker umbrella
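
How the threshold and multiplier interact can be sketched by replaying a list of probe results. The threshold and multiplier values come from the config above, but the state logic is my assumption: down after multiplier consecutive failed probes (no response or RTT over the threshold) and back up on the first good probe.

```python
from typing import Optional, Sequence

THRESHOLD_MS = 200  # 'threshold 200' from the tracker above
MULTIPLIER = 3      # 'multiplier 3' from the tracker above

def tracker_up(rtts_ms: Sequence[Optional[float]],
               threshold_ms: int = THRESHOLD_MS,
               multiplier: int = MULTIPLIER) -> bool:
    """Replay probe results in order and return the final tracker state.

    None means the probe got no response. The recovery behaviour is assumed.
    """
    up = True
    failures = 0
    for rtt in rtts_ms:
        if rtt is not None and rtt <= threshold_ms:
            failures = 0
            up = True
        else:
            failures += 1
            if failures >= multiplier:
                up = False
    return up

print(tracker_up([50, 250, None, 120]))  # two failures then a good probe
print(tracker_up([50, 250, None, 300]))  # three failures in a row
```

With an interval of 20 seconds and a multiplier of 3, this model would take roughly a minute of failed probes before the NAT DIA route is withdrawn.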

A few useful commands for viewing the tracker:

show endpoint-tracker records        Show all trackers and the settings
show endpoint-tracker                Show associated trackers and statistics
show endpoint-tracker tracker-group  Show tracker groups
show endpoint-tracker static-route   Show static-route trackers and statistics
show ip sla summary                  Shows used trackers and the statistics

Control and Data plane

Cisco SD-WAN uses DTLS or TLS tunnels for the control plane and IPsec or GRE tunnels for the data plane.

Control plane

A DTLS tunnel is formed to every vSmart controller over each transport tunnel interface (TLOC color) of the WAN Edge device. As the TLOC extensions are also tunnel interfaces, control connections are also formed over them to each vSmart controller. Only one DTLS tunnel is formed to the vManage, over the local transport interface and not over the TLOC extension.

show sdwan control local-properties
show sdwan control connections
show sdwan control connection-history


S11-cEdge01#show sdwan control connections
                                                                                       PEER                                          PEER                                          CONTROLLER
PEER    PEER PEER            SITE       DOMAIN PEER                                    PRIV  PEER                                    PUB                                           GROUP
TYPE    PROT SYSTEM IP       ID         ID     PRIVATE IP                              PORT  PUBLIC IP                               PORT  LOCAL COLOR     PROXY STATE UPTIME      ID
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
vsmart  dtls 192.168.1.3     1          1      100.100.100.3                           12346 100.100.100.3                           12346 silver          No    up     1:19:32:26  0
vsmart  dtls 192.168.1.4     1          1      100.100.100.4                           12346 100.100.100.4                           12346 silver          No    up     1:19:32:26  0
vsmart  dtls 192.168.1.3     1          1      100.100.100.3                           12346 100.100.100.3                           12346 gold            No    up     1:19:55:09  0
vsmart  dtls 192.168.1.4     1          1      100.100.100.4                           12346 100.100.100.4                           12346 gold            No    up     1:19:55:09  0
vmanage dtls 192.168.1.1     1          0      100.100.100.1                           12346 100.100.100.1                           12346 gold            No    up     1:19:55:09  0
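
The rows in that output follow directly from the rule above, which can be sketched as a small function. The function is hypothetical, the addresses and colors are taken from the output.

```python
def expected_control_connections(colors, vsmarts, vmanage, vmanage_color):
    """One DTLS session per vSmart per TLOC color, plus a single vManage session."""
    connections = [("vsmart", ip, color) for color in colors for ip in vsmarts]
    connections.append(("vmanage", vmanage, vmanage_color))
    return connections

connections = expected_control_connections(
    colors=["silver", "gold"],
    vsmarts=["192.168.1.3", "192.168.1.4"],
    vmanage="192.168.1.1",
    vmanage_color="gold",
)
for row in connections:
    print(row)
```

Two colors and two vSmarts give four vSmart sessions, plus the one vManage session over the local gold transport, which matches the five rows shown.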

OMP peerings

OMP peers are established from the system IPs of the routers to each vSmart controller over the DTLS tunnels. OMP peerings are per-device; unlike DTLS tunnels they are NOT per-transport. vSmarts act as route reflectors, advertising all prefixes they learn from WAN Edge routers back out to all other WAN Edge routers (RR clients).

show sdwan omp peers
show sdwan omp routes
show sdwan omp routes [vpn <vrf>  <x.x.x.x/x> [detail]]
show sdwan omp routes [vpn <vrf>  <x.x.x.x/x> [advertised | received]]
show sdwan omp tlocs [table]                          Table doesn't work on some platforms, so need to use include
show sdwan omp tlocs | in tloc|peer      |site-id|bfd|gold|silver
show sdwan omp services [<service>]
show sdwan ip fib vpn <vpn>


S11-cEdge02#show sdwan omp peers
R -> routes received               I -> routes installed              S -> routes sent
 
                         DOMAIN    OVERLAY   SITE
PEER             TYPE    ID        ID        ID        STATE    UPTIME           R/I/S
------------------------------------------------------------------------------------------
192.168.1.3      vsmart  1         1         1         up       1:20:16:18       3/1/2
192.168.1.4      vsmart  1         1         1         up       1:20:16:18       3/0/2

OMP advertises three types of routes to all routers:

  • OMP routes: Local service-side routes used for reachability between endpoints at different sites
  • TLOC routes: Identifiers that tie an OMP route to a physical location. The TLOC is the only entity of the OMP domain that is visible to the underlying network, so it must be reachable via the underlay
  • Service routes: Identifiers that tie an OMP route to a service in the network (firewall, IPS, load balancer, etc), specifying the location of the service in the network. Service route information is carried in both service and OMP routes

It is by choosing what TLOCs are advertised that the SD-WAN centralised control policy forms the network topology, for example hub-and-spoke or full mesh. A router will only put a route in its routing table if it has a route to the next-hop TLOC of that route (similar to how BGP will not add a route if the next-hop is inaccessible).

Data plane

IPsec tunnels are created between the TLOCs (keys are exchanged in OMP updates) of different WAN Edge devices, with BFD sessions (also used for AAR Quality of Experience measurements) then formed over these tunnels to establish the data plane. It is over these tunnels that traffic is routed, using the route information learnt via OMP over the control connections.

IPsec tunnels and BFD sessions are never formed between any WAN edge devices in the same site. In this setup TLOC extensions are used to allow the same type of redundancy that you would have with iBGP peers between edge routers in the same site in a traditional network setup.

show sdwan ipsec inbound-connections
show sdwan ipsec outbound-connections
show sdwan bfd sessions
show sdwan bfd history
show sdwan tunnel statistics table
show ip route vrf <vpn>
show sdwan ip fib vpn <vpn>


S21-cEdge01#show sdwan bfd sessions
                                    SOURCE TLOC      REMOTE TLOC                                      DST PUBLIC                      DST PUBLIC         DETECT      TX
SYSTEM IP        SITE ID  STATE       COLOR            COLOR            SOURCE IP                       IP                              PORT        ENCAP  MULTIPLIER  INTERVAL(msec) UPTIME          TRANSITIONS
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
192.168.11.1     11       up          gold             silver           192.168.30.2                    20.20.20.14                     12406       ipsec  7           1000           1:18:46:58      0
192.168.11.1     11       up          gold             gold             192.168.30.2                    192.168.10.2                    12386       ipsec  7           1000           1:19:00:09      2
192.168.11.2     11       up          gold             gold             192.168.30.2                    10.10.10.14                     12406       ipsec  7           1000           1:18:46:52      0
192.168.11.2     11       up          gold             silver           192.168.30.2                    192.168.20.2                    12386       ipsec  7           1000           1:19:00:17      2
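
The four sessions above are what you would expect from pairing each local TLOC color with every TLOC learnt from a different site. A minimal sketch of that, assuming no color restrictions are configured; the same-site entry (192.168.21.2) is a hypothetical router added only to show the same-site exclusion.

```python
def expected_bfd_sessions(local_site, local_colors, learned_tlocs):
    """Pair every local color with every TLOC learnt from a different site."""
    sessions = []
    for system_ip, site_id, remote_color in learned_tlocs:
        if site_id == local_site:
            continue  # no IPsec tunnels or BFD sessions within the same site
        for local_color in local_colors:
            sessions.append((system_ip, site_id, local_color, remote_color))
    return sessions

learned_tlocs = [
    ("192.168.11.1", 11, "silver"),
    ("192.168.11.1", 11, "gold"),
    ("192.168.11.2", 11, "gold"),
    ("192.168.11.2", 11, "silver"),
    ("192.168.21.2", 21, "gold"),  # hypothetical same-site router, skipped
]
sessions = expected_bfd_sessions(21, ["gold"], learned_tlocs)
for session in sessions:
    print(session)
```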

In this example topology the vSmart controllers (route-reflectors) advertise (reflect) prefixes learnt from S11-cEdge01 back to S11-cEdge02. S11-cEdge02 does NOT put these routes in its local routing table. They are marked as Inv, U (invalid, TLOC unresolved) because the next-hop (the TLOC, so the tunnel of S11-cEdge01) is unreachable, as the BFD session (data plane) to it is down.

S11-cEdge02#show sdwan omp routes
Code:
C   -> chosen        I   -> installed            Red -> redistributed        Rej -> rejected      L   -> looped        IA  -> On-demand inactive
R   -> resolved      S   -> stale                Ext -> extranet             Inv -> invalid       Stg -> staged        U   -> TLOC unresolved
                                            PATH                      ATTRIBUTE
VPN    PREFIX              FROM PEER        ID     LABEL    STATUS    TYPE       TLOC IP          COLOR            ENCAP  PREFERENCE
--------------------------------------------------------------------------------------------------------------------------------------
3001   10.11.11.0/24       0.0.0.0          75     1002     C,Red,R   installed  192.168.11.2     gold             ipsec  -
                           0.0.0.0          76     1002     C,Red,R   installed  192.168.11.2     silver           ipsec  -
                           192.168.1.3      14     1002     Inv,U     installed  192.168.11.1     gold             ipsec  -
                           192.168.1.3      15     1002     Inv,U     installed  192.168.11.1     silver           ipsec  -
                           192.168.1.4      13     1002     Inv,U     installed  192.168.11.1     gold             ipsec  -
                           192.168.1.4      14     1002     Inv,U     installed  192.168.11.1     silver           ipsec  -
3001   10.21.21.0/24       192.168.1.3      13     1002     C,I,R     installed  192.168.21.1     gold             ipsec  -
                           192.168.1.4      12     1002     C,R       installed  192.168.21.1     gold             ipsec  -

S11-cEdge02#show sdwan omp tlocs | in tloc|peer      |site-id|bfd|gold|silver
tloc entries for 192.168.11.1
                 gold
peer            192.168.1.3
     bfd-status        down
     site-id           11
peer            192.168.1.4
     bfd-status        down
     site-id           11
tloc entries for 192.168.11.1
                 silver
peer            192.168.1.3
     bfd-status        down
     site-id           11
peer            192.168.1.4
     bfd-status        down
     site-id           11
tloc entries for 192.168.11.2
                 gold
peer            0.0.0.0
     bfd-status        up
     site-id           11
tloc entries for 192.168.11.2
                 silver
peer            0.0.0.0
     bfd-status        up
     site-id           11
tloc entries for 192.168.21.1
                 gold
peer            192.168.1.3
     bfd-status        up
     site-id           21
peer            192.168.1.4
     bfd-status        up
     site-id           21
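
Putting the two outputs above together, the resolution check can be sketched as a lookup of each route's next-hop TLOC against its bfd-status. This is a simplified model of what the router appears to do, not the actual OMP best-path logic; the values are copied from the outputs.

```python
def split_by_tloc_state(routes, bfd_up):
    """Split OMP routes into resolved and unresolved by next-hop TLOC state."""
    resolved, unresolved = [], []
    for prefix, tloc_ip, color in routes:
        # A TLOC missing from the table is treated as unresolved
        if bfd_up.get((tloc_ip, color), False):
            resolved.append((prefix, tloc_ip, color))
        else:
            unresolved.append((prefix, tloc_ip, color))
    return resolved, unresolved

# bfd-status per TLOC as seen from S11-cEdge02
bfd_up = {
    ("192.168.11.1", "gold"): False,
    ("192.168.11.1", "silver"): False,
    ("192.168.11.2", "gold"): True,
    ("192.168.11.2", "silver"): True,
    ("192.168.21.1", "gold"): True,
}
routes = [
    ("10.11.11.0/24", "192.168.11.2", "gold"),
    ("10.11.11.0/24", "192.168.11.2", "silver"),
    ("10.11.11.0/24", "192.168.11.1", "gold"),
    ("10.11.11.0/24", "192.168.11.1", "silver"),
    ("10.21.21.0/24", "192.168.21.1", "gold"),
]
resolved, unresolved = split_by_tloc_state(routes, bfd_up)
print(unresolved)
```

The only unresolved entries are the ones pointing at the TLOCs of S11-cEdge01, which lines up with the Inv,U status in the OMP route table.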

For anything SD-WAN based you can't go wrong with the free content on Network Academy, it is better than what you will find on many of the paid courses out there. Below are a few of the Cisco guides on the subjects mentioned.

https://www.cisco.com/c/en/us/td/docs/routers/sdwan/configuration/routing/vEdge-20-x/routing-book/m-unicast-routing.html
https://www.cisco.com/c/en/us/td/docs/routers/sdwan/configuration/system-interface/ios-xe-17/systems-interfaces-book-xe-sdwan/track-static-route-ios-xe.pdf
https://www.cisco.com/c/en/us/td/docs/routers/sdwan/configuration/nat/nat-book-xe-sdwan/configure-nat.html