The Border Gateway Protocol (BGP) is the primary routing protocol used between the tens of thousands of networks that make up the global Internet. Unfortunately, the original design of BGP presumed a fundamental level of trust between all participating networks, and networks accepting incorrect routing information have repeatedly caused both major and minor outages across the Internet. Whether deliberately or accidentally, a network can advertise more specific prefixes for address space controlled by another network to its BGP peers, which causes that traffic to flow through its network instead of to the intended recipient. Preventing these failures requires properly filtering BGP peers, and that in turn requires a robust source of information on which BGP advertisements are actually intended.
Resource Public Key Infrastructure
One of the major additions to BGP peering to help improve the security of advertised prefixes has been the Resource Public Key Infrastructure (RPKI), a public key infrastructure that allows each IP address holder to cryptographically attest which of their prefixes should be expected to be advertised on the Internet, and from which originating Autonomous System Numbers (ASNs). Each of these statements, called Route Origin Authorizations (ROAs), is then signed by one of the five Regional Internet Registries (AFRINIC, APNIC, ARIN, LACNIC, and RIPE NCC), which act as the five roots of trust in RPKI.
This allows the holder of a block of IP addresses to make an authenticated statement that “this block of IP addresses will only be announced by that autonomous system number, with these allowed prefix lengths”. These statements are then signed by their RIR and published, so that every other network can download the ROAs and use them to build a local filter that only accepts routing information from BGP peers that complies with these statements. If a malicious network were to try to hijack a portion of someone else’s address block by announcing a more specific route, even if they spoofed the correct origin ASN in the path, the prefix could be rejected for not matching the permitted length (e.g. a ROA only permitting 2001:db8:1000::/44 to be announced as a /44 would prevent anyone from accepting 2001:db8:1000::/48 announcements from other networks).
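The three constraints in such a statement (covering prefix, maximum length, origin ASN) can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular validator or router implements it; the `Roa` class and `roa_authorizes` function are hypothetical names, and the origin ASN 64502 is a documentation value chosen for the example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Roa:
    """One authorization: prefix, longest allowed length, and origin ASN."""
    prefix: ipaddress.IPv6Network
    max_length: int
    origin_asn: int


def roa_authorizes(roa: Roa, announced: str, origin_asn: int) -> bool:
    """Return True only if the announcement satisfies all three ROA constraints."""
    net = ipaddress.ip_network(announced)
    # The announced prefix must sit inside the address block named by the ROA.
    covered = net.version == roa.prefix.version and net.subnet_of(roa.prefix)
    return covered and net.prefixlen <= roa.max_length and origin_asn == roa.origin_asn


# Hypothetical ROA: 2001:db8:1000::/44, no longer than /44, originated by AS64502.
roa = Roa(ipaddress.ip_network("2001:db8:1000::/44"), max_length=44, origin_asn=64502)

print(roa_authorizes(roa, "2001:db8:1000::/44", 64502))  # the intended announcement
print(roa_authorizes(roa, "2001:db8:1000::/48", 64502))  # longer than the maximum length
print(roa_authorizes(roa, "2001:db8:1000::/44", 64503))  # announced by a different origin
```

Only the first announcement satisfies the ROA. The /48 fails on length even though its origin ASN is correct, which is the case described above.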
Unfortunately, a major impediment to adding cryptographic authentication to BGP has always been the cost of the cryptography itself. As the Internet routing table continues to grow, expecting every Internet core router to perform relatively expensive cryptographic operations for every IP prefix can add an unacceptable load to the control plane supervisor of older or busier routers with large numbers of active BGP sessions. To address this, the RPKI protocol stack includes the RPKI to Router (RTR) protocol, which allows a network operator to decouple RPKI authentication from their network routers running BGP.
The RPKI to Router protocol gives a network operator the choice to run an RPKI validator on a dedicated system alongside their routers to collect and validate all of the RPKI ROAs, and then export the results to BGP routers over RTR while minimizing the processing burden placed on those routers’ control planes. It also means that a network operator can choose which RPKI validator they want to deploy for their network.
The only requirement is that the validator supports the industry standard RTR protocol as defined in RFC 6810 to make the validated ROAs available to routers.
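To give a feel for how little work RTR leaves to the router, here is a minimal sketch of two RTR messages: the Reset Query a router sends to ask for the full data set, and an IPv6 Prefix PDU carrying one validated prefix, maximum length, and origin ASN. The field layout follows my reading of RFC 6810; the function names and the sample values are assumptions made for illustration, and a real client would also need session handling, serial numbers, and error reports.

```python
import ipaddress
import struct

PDU_RESET_QUERY = 2   # router asks the cache for its complete data set
PDU_IPV6_PREFIX = 6   # cache sends one validated IPv6 prefix/origin record


def build_reset_query(version: int = 0) -> bytes:
    """Encode a Reset Query: version, type, 16 reserved bits, total length 8."""
    return struct.pack("!BBHI", version, PDU_RESET_QUERY, 0, 8)


def parse_ipv6_prefix(pdu: bytes) -> dict:
    """Decode a 32-byte IPv6 Prefix PDU into its fields."""
    version, pdu_type, _reserved, length = struct.unpack("!BBHI", pdu[:8])
    if pdu_type != PDU_IPV6_PREFIX or length != 32 or len(pdu) != 32:
        raise ValueError("not an IPv6 Prefix PDU")
    flags, prefix_len, max_len, _zero, raw_prefix, asn = struct.unpack(
        "!BBBB16sI", pdu[8:]
    )
    return {
        "version": version,
        "announce": bool(flags & 1),  # lowest flag bit: 1 = announce, 0 = withdraw
        "prefix": ipaddress.IPv6Network((ipaddress.IPv6Address(raw_prefix), prefix_len)),
        "max_length": max_len,
        "asn": asn,
    }


# Hand-built sample PDU (illustrative values, not captured from a real cache).
sample = struct.pack(
    "!BBHIBBBB16sI",
    0, PDU_IPV6_PREFIX, 0, 32,          # header: version, type, reserved, length
    1, 44, 44, 0,                       # flags, prefix length, max length, zero
    ipaddress.IPv6Address("2001:db8:1000::").packed,
    64502,                              # origin ASN
)

print(build_reset_query().hex())
print(parse_ipv6_prefix(sample))
```

The router never sees certificates or signatures here, only already-validated tuples, which is what keeps the load on its control plane small.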
EOS Support for RPKI
With the release of EOS 4.24.0F, we are excited to announce that Arista’s switches now include support for RPKI via RTR, so adding cryptographic security to your BGP routing policy is now possible with EOS and any appropriate third party RPKI validator.
For larger network operators, it may still make sense to offload the RPKI validation to a dedicated server, but remember that the E in EOS stands for Extensible! In this article I’m going to show how to run an RPKI validator on an Arista switch itself, so that adding RPKI support to a network’s routing policy doesn’t require any additional hardware or rack space for a dedicated system to validate the RPKI ROAs, assuming that adding this computational and storage burden to the switch’s control plane is operationally acceptable.
To follow this guide, you’re going to need the following:
- An Arista platform running at least EOS 4.24.0F, with support for RPKI RTR.
- Sufficient available RAM and storage on the switch to support running the RPKI validator and storing the complete ROA database (approximately 1GB of each).
- Running BGP in Multi-Agent mode to enable RTR support.
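The memory and storage prerequisite can be checked before installing anything. The following is a minimal sketch, assuming a Python 3 interpreter is available on the Linux side of the switch; the helper names are hypothetical, and the 1 GB threshold is simply the approximate figure from the list above.

```python
import shutil

REQUIRED_BYTES = 1024 ** 3  # roughly 1 GB, the figure quoted in the list above


def mem_available_bytes(meminfo_text: str) -> int:
    """Return MemAvailable from /proc/meminfo-style text (the kernel reports kB)."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) * 1024
    raise ValueError("MemAvailable not found")


def disk_free_bytes(path: str) -> int:
    """Return the free bytes on the file system that holds `path`."""
    return shutil.disk_usage(path).free


def has_headroom(available: int, required: int = REQUIRED_BYTES) -> bool:
    """True when the available amount meets the requirement."""
    return available >= required


# Illustrative input only; on a switch you would read /proc/meminfo instead.
sample_meminfo = "MemTotal:        8052476 kB\nMemAvailable:    2097152 kB\n"
print(has_headroom(mem_available_bytes(sample_meminfo)))
print(has_headroom(disk_free_bytes(".")))  # on the switch, pass the Docker storage path
```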
There are many ways to run and manage a guest daemon in EOS, but one of the simplest, and the one we’ll use for this article, is to install NLnet Labs’ Routinator daemon as a Docker container. Containers allow you to run a guest process in its own sandbox, which minimizes its dependencies and the ways that the contained application can interact with the rest of the local operating system, without the additional overhead of running an entire virtual machine with its own Linux kernel and copy of the entire required userspace.
EOS ships with support for Docker, but the service is not enabled by default, and it normally stores containers in a tmpfs file system which is lost when the system is reloaded. This can be fine for many applications, but we would rather run Docker from local persistent storage, so that we don’t need to reinstall the Routinator container every time we boot EOS, and so that Routinator only needs to download the recent changes to its ROA database instead of synchronizing the full database from all of the RPKI servers. To support this, we can write a short shell script that creates a persistent folder on the optional SSD available on many Arista models (which shows up in EOS as drive:, or /mnt/drive/ in Linux), bind mounts this folder onto /var/lib/docker in the Linux file system, and finally starts the docker service to enable running containers.
#!/bin/bash
# Shell script to mount a persistent folder for docker and start the service on boot
DOCKERPERSIST="/mnt/drive/docker"
mkdir -p "$DOCKERPERSIST"
mount --bind "$DOCKERPERSIST" /var/lib/docker
service docker start
I saved this script as /mnt/drive/bin/startdocker.sh and made it executable by running “chmod +x /mnt/drive/bin/startdocker.sh”. We then want to create an event handler to run this script when EOS boots such that the Docker directory is mounted and the service launched. Docker will then automatically run any enabled containers installed in this persistent storage.
event-handler startdocker
   trigger on-boot
   action bash sudo /mnt/drive/bin/startdocker.sh
There is no requirement that Docker is run from the “drive:” SSD partition. Some Arista devices come with enough storage in “flash:” to support running Docker from there, as well as other storage options such as a USB drive or mounting storage over the network.
After enabling Docker, installing and configuring the Routinator container is relatively straightforward if you follow the instructions on its GitHub page. The main difficulty is that the ARIN Trust Anchor Locator (TAL) requires an unusual license agreement, unlike any other root certificate from a Certificate Authority, so Routinator is unable to ship with that TAL included. To work around this, we need to manually create a Docker volume specifically for the five TALs and run the container once to explicitly accept the ARIN “Relying Party Agreement” license and fetch their TAL.
SW1# bash sudo docker volume create routinator-tals
SW1# bash sudo docker run --rm -v routinator-tals:/home/routinator/.rpki-cache/tals nlnetlabs/routinator init -f --accept-arin-rpa
Docker will automatically load all of the container dependencies for nlnetlabs/routinator, map the newly created volume to the path “/home/routinator/.rpki-cache/tals” inside the container, and save the five trust anchors inside that directory. We are then ready to launch Routinator and map its RTR port to our switch’s loopback interface so that it is available to EOS.
SW1# bash sudo docker run -d --restart=unless-stopped --name routinator -p [::1]:323:3323 -v routinator-tals:/home/routinator/.rpki-cache/tals nlnetlabs/routinator
After a few minutes of downloading and processing all of the Internet’s ROAs, Routinator will be running and ready to answer RTR queries on the local system. In this example, we’re running RTR just on the loopback interface since plain RTR is unauthenticated, but you could also consider making this RPKI validator available to other devices using RTR across the network, depending on your exact requirements. Note that Routinator runs RTR on port 3323 by default instead of the standard 323 to avoid needing root privileges to bind to a port below 1024, but Docker gives us the flexibility to map it to the correct port outside the container.
BGP Peering Example
For the rest of the article, we’re going to look at a simple BGP peering example showing how to use this new RPKI feature and what it looks like in EOS. Since this example uses documentation ASNs and prefixes, note that the RPKI Route Origin Authorizations shown don’t actually exist, so don’t expect this example to work in your own lab (without extra work to create your own trust anchor to spoof ROAs!).
In this scenario, we’re going to be running an Arista switch as AS64501, peering with another network running AS64502. The network operators for AS64502 have signed and published a ROA for their 2001:db8:1000::/44 address space, with a maximum length set to /44, preventing the announcement of any more specific prefixes, or any announcements of the address space by other autonomous systems than themselves.
To support RPKI filtering on this peer, we’re going to need to enable the multi-agent routing model, install and start the Routinator container as covered above, enable IP routing, and configure a BGP neighborship with AS64502.
service routing protocols model multi-agent
!
interface Ethernet1
   no switchport
   ipv6 address 2001:db8::1/64
!
event-handler startdocker
   trigger on-boot
   action bash sudo /mnt/drive/bin/startdocker.sh
!
ipv6 unicast-routing
!
router bgp 64501
   router-id 0.64.50.1
   no bgp default ipv4-unicast
   neighbor 2001:db8::2 remote-as 64502
   neighbor 2001:db8::2 maximum-routes 10
   !
   rpki cache local-routinator
      host ::1
   !
   rpki origin-validation
      ebgp local
A reload will be required to enable the multi-agent model, and will also start Docker to allow us to install Routinator.
We can confirm that we’re connected to our local RPKI validator once it’s running by asking EOS to display its ROA cache status.
SW1#show bgp rpki cache
local-routinator:
  Host: ::1 port 323
  VRF: default
  Refresh interval: 653 seconds
  Retry interval: 600 seconds
  Expire interval: 7200 seconds
  Preference: 5
  Protocol version: 1
  State: synced
  Last update sync: 0:01:04 ago
  Last full sync: 0:10:31 ago
  Last serial query: 0:01:04 ago
  Last reset query: 0:10:43 ago
  Entries: 138026
  Connection: Active (0:10:43)
Remember that BGP still accepts everything by default; we need to write a route-map to evaluate all the received routes from AS64502 in light of RPKI and apply that route-map to our neighbor:
route-map rmap-rpki-rejectinvalid deny 10
   description Reject all RPKI Invalid routes
   match origin-as validity invalid
!
route-map rmap-rpki-rejectinvalid permit 20
   description Implement the rest of your BGP policy
!
ip as-path access-list 1 permit ^$ any
!
route-map rmap-localonly permit 10
   description Only export our locally originated prefixes
   match as-path 1
!
router bgp 64501
   router-id 0.64.50.1
   no bgp default ipv4-unicast
   neighbor 2001:db8::2 remote-as 64502
   neighbor 2001:db8::2 maximum-routes 10
   !
   address-family ipv6
      neighbor 2001:db8::2 activate
      neighbor 2001:db8::2 route-map rmap-rpki-rejectinvalid in
      neighbor 2001:db8::2 route-map rmap-localonly out
   !
   rpki cache local-routinator
      host ::1
   !
   rpki origin-validation
      ebgp local
In this example, we have the rmap-rpki-rejectinvalid route-map, which doesn’t implement any BGP policy except for rejecting any prefixes which evaluate as conflicting with an RPKI ROA. Prefixes can be classified as one of three different RPKI states:
- Valid – This means that the prefix matches an existing ROA, so you can be confident that it is being advertised as intended per RPKI.
- Unknown – This means no ROA exists for the prefix, which is likely due to the fact that the holder of the address space simply hasn’t implemented RPKI yet for their resources, and you’ll need to fall back to other traditional BGP filtering methods (and encourage your BGP peer to consider publishing RPKI ROAs for their address resources).
- Invalid – This means that the received prefix directly conflicts with an existing ROA for the same address space. This likely means that someone is trying to hijack a prefix or the network is advertising a prefix which isn’t intended. In either case, the recommended action is to completely drop the prefix and not make any routing decisions based on it.
A common misconception about RPKI invalid prefixes has been that they should still be accepted but with a lowered local-preference. This is incorrect, and defeats the purpose of RPKI since a more specific longest prefix match will still supersede a less specific BGP route, regardless of the local-preference. In the current example, if a different, malicious, network wanted to hijack the 2001:db8:1002::/48 subnet and advertised it to their BGP peers, applying a lower local-preference wouldn’t protect AS64502 from the attack. 2001:db8:1002::/48 with a BGP local preference of 5 would still impact routing for that subnet, even if the same router had the correct 2001:db8:1000::/44 local-preference 100 route in their routing table.
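The reason local-preference cannot help can be shown with a toy forwarding lookup. This is a simplified sketch, not a model of any real router: the `Route` class and `forward` function are hypothetical, and AS64510 is a documentation ASN standing in for the unnamed hijacking network.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: ipaddress.IPv6Network
    local_pref: int
    origin_asn: int


def forward(rib, destination: str) -> Route:
    """Pick the forwarding route: the longest matching prefix wins.

    Local-preference only breaks ties between paths for the SAME prefix,
    so it plays no part in choosing between a /44 and a /48.
    """
    addr = ipaddress.ip_address(destination)
    matches = [r for r in rib if addr in r.prefix]
    return max(matches, key=lambda r: r.prefix.prefixlen)


rib = [
    # Legitimate covering route from AS64502 with the default local-preference.
    Route(ipaddress.ip_network("2001:db8:1000::/44"), local_pref=100, origin_asn=64502),
    # Hijacked more specific route, accepted but "penalized" with local-preference 5.
    Route(ipaddress.ip_network("2001:db8:1002::/48"), local_pref=5, origin_asn=64510),
]

hijacked = forward(rib, "2001:db8:1002::1")
untouched = forward(rib, "2001:db8:1001::1")
print(hijacked.prefix, hijacked.local_pref, hijacked.origin_asn)
print(untouched.prefix, untouched.local_pref, untouched.origin_asn)
```

Traffic to 2001:db8:1002::1 follows the /48 towards the hijacker despite its low local-preference, while the rest of the /44 still follows the legitimate route. Only dropping the invalid route removes it from the lookup.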
To verify that we’re correctly accepting and classifying the 2001:db8:1000::/44 from our peer AS64502, we can ask EOS to show the accepted routes for that BGP neighbor:
SW1#show bgp neighbors 2001:db8::2 routes
BGP routing table information for VRF default
Router identifier 0.64.50.1, local AS number 64501
Route status codes: s - suppressed, * - valid, > - active, # - not installed, E - ECMP head, e - ECMP
                    S - Stale, c - Contributing to ECMP, b - backup, L - labeled-unicast
                    % - Pending BGP convergence
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI Origin Validation codes: V - valid, I - invalid, U - unknown
AS Path Attributes: Or-ID - Originator ID, C-LST - Cluster List, LL Nexthop - Link Local Nexthop

          Network                Next Hop       Metric  LocPref  Weight  Path
 * >   V  2001:db8:1000::/44     2001:db8::2    0       100      0       64502 ?
Note how the 2001:db8:1000::/44 prefix is correctly marked with “V” on the left for “valid” as it matches the RPKI ROA for “AS64502 origin for 2001:db8:1000::/44 with a maximum length 44”!
Now let’s say that one of the network operators at AS64502 is having a bad day and makes a couple of typos in their BGP configuration. They unintentionally advertise one of their own more specific prefixes, 2001:db8:1000::/48, which would adversely impact their routing, and (completely innocently) also advertise the prefix 2001:db8:2000::/44, which belongs to a totally different network, AS64503.
Thankfully, AS64503 also uses RPKI, and has published a ROA for their 2001:db8:2000::/44 address space: “AS64503 origin for 2001:db8:2000::/44 with a maximum length of 44”. Since both of these new BGP advertisements conflict with existing RPKI ROAs, our switch is able to correctly reject both of these prefixes as RPKI invalid. The output of “show bgp neighbors 2001:db8::2 routes” will still look the same, but we can check for these invalid prefixes by looking at the received routes from 2001:db8::2, which aren’t limited to the routes accepted by the route-map:
SW1#show bgp neighbors 2001:db8::2 received-routes
BGP routing table information for VRF default
Router identifier 0.64.50.1, local AS number 64501
Route status codes: s - suppressed, * - valid, > - active, # - not installed, E - ECMP head, e - ECMP
                    S - Stale, c - Contributing to ECMP, b - backup, L - labeled-unicast
                    % - Pending BGP convergence
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI Origin Validation codes: V - valid, I - invalid, U - unknown
AS Path Attributes: Or-ID - Originator ID, C-LST - Cluster List, LL Nexthop - Link Local Nexthop

          Network                Next Hop       Metric  LocPref  Weight  Path
 * >   V  2001:db8:1000::/44     2001:db8::2    0       100      0       64502 ?
       I  2001:db8:1000::/48     2001:db8::2    0       100      0       64502 ?
       I  2001:db8:2000::/44     2001:db8::2    0       100      0       64502 ?
Both of these invalid prefixes are marked with an “I” and automatically rejected by our rmap-rpki-rejectinvalid route-map. Neither of them is allowed to become active and cause routing issues for any of the networks involved; 2001:db8:1000::/48 is rejected because the length 48 conflicts with the “maximum length 44” in the published ROA, and 2001:db8:2000::/44 is rejected because the origin autonomous system number 64502 conflicts with the “64503 origin” in the published ROA.
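The classification of these received routes can be reproduced with a short sketch of origin validation along the lines of RFC 6811: a route is unknown when no ROA covers it, valid when at least one covering ROA matches both its origin ASN and its length, and invalid otherwise. The `Vrp` tuple and `classify` function are hypothetical names, and the fourth route (2001:db8:3000::/44) is an extra made-up announcement added only to show the unknown state.

```python
import ipaddress
from typing import NamedTuple


class Vrp(NamedTuple):
    """Validated ROA payload: the tuple a validator hands to the router."""
    prefix: str
    max_length: int
    origin_asn: int


def classify(vrps, announced: str, origin_asn: int) -> str:
    """Return 'valid', 'invalid', or 'unknown' for one received route."""
    net = ipaddress.ip_network(announced)
    covering = []
    for vrp in vrps:
        roa_net = ipaddress.ip_network(vrp.prefix)
        if roa_net.version == net.version and net.subnet_of(roa_net):
            covering.append(vrp)
    if not covering:
        return "unknown"  # nobody has published a ROA for this address space
    for vrp in covering:
        if vrp.origin_asn == origin_asn and net.prefixlen <= vrp.max_length:
            return "valid"
    return "invalid"  # covered by a ROA, but origin or length conflicts with it


vrps = [
    Vrp("2001:db8:1000::/44", 44, 64502),
    Vrp("2001:db8:2000::/44", 44, 64503),
]

received = [
    ("2001:db8:1000::/44", 64502),
    ("2001:db8:1000::/48", 64502),
    ("2001:db8:2000::/44", 64502),
    ("2001:db8:3000::/44", 64502),  # hypothetical extra route with no covering ROA
]

for prefix, asn in received:
    print(prefix, classify(vrps, prefix, asn))

# Mirror the route-map: drop invalid routes, pass everything else to later policy.
accepted = [p for p, asn in received if classify(vrps, p, asn) != "invalid"]
print(accepted)
```

The first three results match the V, I, and I codes in the output above, and the final filter keeps the valid and unknown routes, as the route-map does.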
In this article, we looked at how to run an RPKI validator and RTR cache server locally on the device’s loopback interface to enable RPKI filtering on BGP peers. Running Routinator on the local control plane was possible because EOS gives network operators the flexibility to tailor their Arista devices to best meet their needs using extensible features such as Docker containers.
By running the Routinator container from local persistent storage, in the event of a reload EOS will still be able to perform RPKI validation based on the local RPKI database until network connectivity is restored and Routinator is able to update its ROA cache and make that information available to EOS using the RTR protocol.
For the sake of simplicity, we did not explore further possible enhancements to this deployment, such as using Linux control groups to enforce resource limits on the Routinator container or sharing RTR servers between multiple routers on the network. How best to deploy RPKI will depend on the specifics of your network and requirements, so treat this article as an example of only one option for deploying RPKI validation.