Pause – Revisit the Fundamentals – ARP

 
 
Introduction

Wow, networking technology really does continue to march along. If you wanted to be a lifelong learner you definitely picked a great speciality. And face it, we all know the cool kids are the Network Engineers.

In this article we’re not going to take a bunch of packet captures nor analyze the outputs of a dozen ‘show’ commands. There are plenty of documents for that already. Rather, this document, like the entire Pause series, looks to take a step back and feed your team’s banter about ‘What problem are we trying to solve?’

Evolving Tech

New Layer 1 technologies over the last couple of years include more 100GbE formats, 25GbE and 400GbE, with 800GbE underway and more to come. And those are just at the Physical layer. As we work up the stack we hit newer technologies at every level. How’s your VXLAN rollout coming along? Have you moved on to EVPN VXLAN? Let us not forget all of that automation we should be including in our design considerations.

Studying these newer features through lab testing, I came to a realization: we often need to revisit the fundamentals. You may need to go back to the beginning to understand these shiny new things in a way that ensures successful implementation and operation. “What was the intent of the original underlying technologies?” “What problem were they trying to solve?” And “What problem am I trying to solve now?”

Overlays

Let’s take Overlays as an example. When deploying EVPN VXLAN one of the topics that frequently comes up is, “What happens with BUM traffic, those Broadcast, Unknown Unicast and Multicast frames and packets?” We find ourselves revisiting the basic behaviors of a network switch. And don’t forget about the old roles of the dedicated router. During that exercise we end up reviewing the mechanisms of protocols that we have long taken for granted. An example is ARP (Address Resolution Protocol). How do we handle ARP Requests and ARP Replies in this EVPN VXLAN world?

Start with the RFCs

Flashback to the ‘80s. Digging through the IETF archives you find yourself landing on RFC 826¹. This RFC dates back to 1982. I particularly like the way RFC 826 starts out the Problem Statement section. “The world is a jungle in general, and the networking game contributes many animals.” With each new way to solve old problems I wonder if we’re inviting new animals to the party or putting costumes on the guests already in attendance. RFC 826 has since been updated by RFC 5227 (2008) and RFC 5494 (2009). See, even the RFCs aren’t static.

Reading beyond the RFCs

The RFCs are a great starting point. Supplemental reading is helpful if you find the right authoritative resources. So moving beyond the IETF, I also cracked open my old copy of TCP/IP Illustrated, Volume 1 by W. Richard Stevens², as well as, to quote Alice in Chains, ‘a new friend turned me on to an old favorite’: The TCP/IP Guide, by Charles M. Kozierok³.

ARP

What problem does ARP solve? When we know the Layer 3 IP address, the OG of ‘virtual addresses’, we need to map that IP address to the physical hardware address. Once we’ve resolved the IP address of the destination, often via DNS, our end point needs to determine whether that IP address is on the same subnet or a different one. If it is on a different subnet, the Default Gateway is used as the routing device. The router then needs an entry in one of its tables that tells it either how to reach the IP address, if the destination is not on a subnet local to the router, or that IP address X.X.X.X maps to the 48-bit physical address AA:BB:CC:DD:EE:FF, if the router has the destination subnet attached. If the destination IP is on the same subnet as the source IP, the source device checks whether it already knows the IP to MAC address mapping in a local cache. If not, it sends a broadcast frame, destination MAC all F’s (FF:FF:FF:FF:FF:FF), as a shout out to all members of that subnet asking ‘Who owns X.X.X.X? Please tell <source>.’
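To make that decision path concrete, here is a minimal Python sketch, not a real host stack. It assumes a /24 prefix and a gateway address of 172.16.55.254, both made up for illustration, and the function names next_hop and build_arp_request are hypothetical. It shows the same-subnet check that picks which IP to resolve, and how the broadcast ARP Request is laid out on Ethernet per RFC 826.

```python
import ipaddress
import struct

BROADCAST_MAC = bytes.fromhex("ffffffffffff")  # the "all F's" destination
ETHERTYPE_ARP = 0x0806


def next_hop(src_interface: str, dst_ip: str, gateway: str) -> str:
    """Return the IP we must resolve with ARP: the destination itself
    when it is on our subnet, otherwise the default gateway."""
    iface = ipaddress.ip_interface(src_interface)
    return dst_ip if ipaddress.ip_address(dst_ip) in iface.network else gateway


def build_arp_request(src_mac: bytes, src_ip: str, target_ip: str) -> bytes:
    """Build an Ethernet frame carrying an ARP Request (opcode 1)."""
    ethernet = BROADCAST_MAC + src_mac + struct.pack("!H", ETHERTYPE_ARP)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # opcode: request
        src_mac,      # sender hardware address
        ipaddress.ip_address(src_ip).packed,
        bytes(6),     # target hardware address: unknown, so zeros
        ipaddress.ip_address(target_ip).packed,
    )
    return ethernet + arp


if __name__ == "__main__":
    target = next_hop("172.16.55.2/24", "172.16.55.1", "172.16.55.254")
    frame = build_arp_request(bytes.fromhex("500000f6ad37"), "172.16.55.2", target)
    print(target, len(frame))
```

The frame comes to 42 bytes: 14 bytes of Ethernet header plus 28 bytes of ARP payload.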

On the Wire

OK – maybe one packet capture. Remember, on Arista switches you can quickly check the Control Plane traffic on an interface by using tcpdump. Here’s an example:

hostsw5.22:30:04#bash tcpdump -i vlan55 arp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vlan55, link-type EN10MB (Ethernet), capture size 262144 bytes
22:30:38.804691 50:00:00:f6:ad:37 (oui Unknown) > Broadcast, ethertype ARP (0x0806), length 42: Request who-has 172.16.55.1 tell 172.16.55.2, length 28
22:30:39.831186 50:00:00:f6:ad:37 (oui Unknown) > Broadcast, ethertype ARP (0x0806), length 42: Request who-has 172.16.55.1 tell 172.16.55.2, length 28
22:30:40.855171 50:00:00:f6:ad:37 (oui Unknown) > Broadcast, ethertype ARP (0x0806), length 42: Request who-has 172.16.55.1 tell 172.16.55.2, length 28

The ARP Request is shouted to all members of the VLAN. It is one thing to have all members of that VLAN residing on the same switch. But what if that VLAN resides on multiple switches? And more, what if those other switches hosting that same VLAN are not Layer 2 adjacent to us? In other words, we don’t have an access or trunk port to reach those other switches. Instead, we have a Layer 3 boundary in between us. How do we overcome this Layer 3 boundary that was introduced in Overlay technologies such as EVPN VXLAN?

Answer: we make the whole Overlay mimic the behavior of a single switch. We wrap up that ARP broadcast and flood it to the other VXLAN VTEPs (Virtual Tunnel Endpoints) that we know about, either statically through a manual VTEP Flood List or dynamically via EVPN Type 3 IMET (Inclusive Multicast Ethernet Tag) routes.
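The flooding step can be sketched in a few lines of Python. This is a conceptual illustration of replicating a BUM frame to every VTEP in a flood list (often called head-end replication), not how any switch implements it. The VNI value, the VTEP addresses and the function names are hypothetical. The only protocol facts used are the 8-byte VXLAN header layout and UDP port 4789. The outer Ethernet, IP and UDP headers are left out.

```python
import struct

VXLAN_UDP_PORT = 4789  # IANA-assigned VXLAN destination port


def vxlan_header(vni: int) -> bytes:
    """8-byte VXLAN header: flags byte with the I bit set, 24 reserved
    bits, the 24-bit VNI, then 8 reserved bits."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", 0x08 << 24, vni << 8)


def flood_bum(frame: bytes, vni: int, flood_list: dict) -> list:
    """Return one (vtep_ip, udp_port, payload) copy per remote VTEP that
    is in the flood list for this VNI. The list could have been built
    statically or from received IMET routes; the sketch does not care."""
    payload = vxlan_header(vni) + frame
    return [(vtep, VXLAN_UDP_PORT, payload) for vtep in flood_list.get(vni, [])]


if __name__ == "__main__":
    # Hypothetical flood list: VNI 10055 is reachable behind two remote VTEPs.
    flood_list = {10055: ["10.0.0.2", "10.0.0.3"]}
    arp_request = bytes(42)  # stand-in for a 42-byte broadcast ARP frame
    for vtep, port, payload in flood_bum(arp_request, 10055, flood_list):
        print(vtep, port, len(payload))
```

Each copy is a unicast packet in the underlay, which is why the size of the flood list matters when we start asking about scale.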

Oh, wait. We’ve been referring to ARP and IPv4. What about IPv6? What if we want to run IPv6 in the Overlay? Here again we need to understand the native behavior of IPv6’s discovery mechanism. Now we’re dealing with ND (Neighbor Discovery). We’re going to see ND Neighbor Solicitation messages. So extend our homework fun to include digging into the differences between ARP’s broadcast method and IPv6’s multicast method with the Solicited Node multicast address. Can the Overlay deal with this scenario, and if so, how?
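As a small worked example of that difference, the Python sketch below derives the Solicited Node multicast address for a target (the prefix ff02::1:ff00:0/104 plus the low 24 bits of the unicast address) and the Ethernet multicast MAC it maps to (33:33 followed by the low 32 bits of the group). The target address is from the documentation prefix and the function names are made up for illustration.

```python
import ipaddress


def solicited_node(addr: str) -> ipaddress.IPv6Address:
    """Solicited-node multicast group: ff02::1:ff00:0/104 combined with
    the low-order 24 bits of the unicast address."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


def multicast_mac(group: ipaddress.IPv6Address) -> str:
    """Ethernet destination for an IPv6 multicast group: 33:33 followed
    by the low-order 32 bits of the group address."""
    return "33:33:" + ":".join(f"{b:02x}" for b in group.packed[-4:])


if __name__ == "__main__":
    group = solicited_node("2001:db8::55:1")
    print(group, multicast_mac(group))
```

Where ARP interrupts every host in the VLAN, a Neighbor Solicitation is addressed to a group that only hosts sharing those low 24 bits listen to. Whether the Overlay still floods it like any other multicast frame is exactly the kind of question worth asking.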

2nd Level Questions

One 2nd level question to ask now is ‘How much of this behavior can our network design tolerate?’ Or written another way: ‘What is the scalability of this behavior, and if it is bounded, can it be optimized?’ Another 2nd level question to ask is ‘How can I see, verify and, if needed, troubleshoot these functions in this new world?’

Now we’re really getting somewhere. In order to tweak something you need to understand the fundamentals of that which you desire to optimize. Understanding the reason for, and the format of, our ARP Requests and ARP Replies can allow us to tune for maximum performance and scale.

What else is part of that 2nd level questioning? How about the concept of an ARP cache? And if we cache, how do we know when we should flush that cache? Because part of a solution like EVPN VXLAN also needs to solve for VM Mobility. The destination was ‘here’ on switch 1 in Data Center 1, but now we need to know that it has moved to ‘there’ on switch 15 in Data Center 2. How do we ensure that information gets propagated in a timely and efficient manner throughout our topology?
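To ground the caching question, here is a toy ARP cache in Python. It is only a sketch: the 300 second timeout is an arbitrary illustrative value, not any platform’s default, and the class and method names are hypothetical. It shows the two events that matter: an entry aging out, and a binding being overwritten when a newer ARP message (a gratuitous ARP, for example) says something different.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ArpCache:
    """Toy IP-to-MAC cache with aging. The timeout is illustrative only."""

    def __init__(self, max_age: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_age = max_age
        self.clock = clock
        self.entries: Dict[str, Tuple[str, float]] = {}

    def learn(self, ip: str, mac: str) -> bool:
        """Install or refresh a binding, as an ARP Reply or gratuitous ARP
        would. Returns True when a known IP now maps to a different MAC."""
        old = self.entries.get(ip)
        self.entries[ip] = (mac, self.clock())
        return old is not None and old[0] != mac

    def lookup(self, ip: str) -> Optional[str]:
        """Return the cached MAC, or None if unknown or aged out."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, learned_at = entry
        if self.clock() - learned_at > self.max_age:
            del self.entries[ip]  # stale: the caller must ARP again
            return None
        return mac


if __name__ == "__main__":
    cache = ArpCache()
    cache.learn("172.16.55.1", "50:00:00:aa:bb:01")
    print(cache.lookup("172.16.55.1"))
```

Passing the clock in as a parameter is only there so the aging logic can be exercised without waiting. A real implementation also has to decide who else needs to hear about a change, which is where the Overlay’s control plane comes in.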

This makes us think of convergence beyond just routing protocol updates. We’re seeking MAC address-table convergence too. Could we save overhead on some devices by allowing others to answer on behalf of the destination end host or last hop network device? Proxy anyone? Proxy ARP, Reverse ARP and Gratuitous ARP are examples of optimizations that also need to be accounted for in an Overlay solution. Are there new terms added in support of the existing tech? Answer: yes. Check out “ARP Suppression.” Now take the existing concept of Local ARP entries and layer in Remote ARP entries. If we’re looking for optimizations, could one of those be to ask, “Do I need to take up space in hardware if the Remote ARP entries learned are not actually part of a communication flow for the hosts directly connected to a particular VTEP?”
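The general idea behind ARP Suppression can be sketched as a simple decision at the ingress VTEP: if the target IP is already known as a Remote ARP entry, answer locally instead of flooding; otherwise fall back to flooding. The Python below is a conceptual sketch only. The table contents, the RemoteArpEntry type and the handle_arp_request function are hypothetical, and real platforms differ in the details.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class RemoteArpEntry:
    """An IP-to-MAC binding learned through the control plane rather than
    from a locally attached host (hypothetical structure)."""
    mac: str
    vtep: str  # the remote VTEP the host sits behind


def handle_arp_request(target_ip: str,
                       remote_entries: Dict[str, RemoteArpEntry]
                       ) -> Tuple[str, Optional[str]]:
    """Decide what the ingress VTEP does with a host's ARP Request.

    Returns ("reply", mac) when the VTEP can answer on the remote host's
    behalf, or ("flood", None) when the request still has to be flooded.
    """
    entry = remote_entries.get(target_ip)
    if entry is None:
        return ("flood", None)
    return ("reply", entry.mac)


if __name__ == "__main__":
    remote_entries = {
        "172.16.55.10": RemoteArpEntry(mac="50:00:00:aa:bb:10", vtep="10.0.0.3"),
    }
    print(handle_arp_request("172.16.55.10", remote_entries))
    print(handle_arp_request("172.16.55.99", remote_entries))
```

Seen this way, the hardware-space question above becomes a question about which entries are worth keeping in that table.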

Summary

The habit of following the evolution of these technologies plays a vital role in ensuring that we understand the basics, that we design, build and test for all scenarios, and that we operate the network ecosystem as best we can. Understanding the fundamentals and subsequent optimizations is as much a part of our due diligence as the homework of testing out the newer technologies. As you dig into these new options to solve both old and new problems, remember to pause. Pause and revisit the fundamentals.

Continued Reading – Arista TOIs

EOS Configuration Guide
https://www.arista.com/en/support/product-documentation

EVPN VXLAN Webinar Series, Session 2, Optimizing ARP in EVPN VXLAN

Troubleshooting Unknown Unicast Flooding
https://eos.arista.com/troubleshooting-unknown-unicast-flooding/

Troubleshooting Based on Control Plane Policing (CoPP) for Sand Platforms
https://eos.arista.com/troubleshooting-based-on-control-plane-policing-copp-for-sand-platform/

Centralized vs. Distributed VXLAN Routing with EVPN
https://eos.arista.com/centralized-vs-distributed-vxlan-routing-with-evpn/

VXLAN Troubleshooting Guide
https://eos.arista.com/vxlan-troubleshooting-guide/

TOI 4.23.2F
https://eos.arista.com/eos-4-23-2f/evpn-centralized-anycast-gateway/

TOI 4.23.2F
https://eos.arista.com/eos-4-23-2f/evpn-vxlan-all-active-multi-homing-integrated-routing-and-bridging/

TOI 4.23.1F
https://eos.arista.com/eos-4-23-1f/evpn-vxlan-ipv6-overlay/

TOI 4.21.3F
https://eos.arista.com/eos-4-21-3f/provide-user-control-of-selective-arp/

TOI 4.23.0F
https://eos.arista.com/eos-4-23-0f/evpn-vxlan-ipv6-overlay-toi/

TOI 4.22.1F
https://eos.arista.com/eos-4-22-1f/inter-vrf-local-connected-route-leaking/

TOI 4.21.6F
https://eos.arista.com/troubleshooting-evpn-irb-vxlan/

TOI 4.21.1F
https://eos.arista.com/eos-4-21-1f/ipv6-and-vrf-support-for-arp-converted-host-routes-injection-into-bgp/

TOI 4.20.1F
https://eos.arista.com/eos-4-20-1f/hostinject/

TOI 4.20.1F
https://eos.arista.com/eos-4-20-1f/expanded-vrrp-varp-virtual-mac-capabilities/

Remove ARP from L3 when MAC L2 port is down
https://eos.arista.com/remove-arp-from-l3-when-mac-l2-port-is-down/

ARP Reply Relay for VXLAN L3 Data Center Interconnect
https://eos.arista.com/eos-4-18-0f/arp-relay-vxlan-dci/

ARP Replies in a VXLAN Plus Routing Data Center Interconnect Deployment
https://eos.arista.com/arp-replies-in-a-vxlan-plus-routing-data-center-inter-connect-deployment/

Static ARP Inspection
https://eos.arista.com/eos-4-15-0f/static-arp-inspection/

 

Continued Reading – Other References

¹ An Ethernet Address Resolution Protocol, IETF RFC 826 (https://tools.ietf.org/html/rfc826)

² TCP/IP Illustrated, Volume 1 by W. Richard Stevens, copyright 1994

³ The TCP/IP Guide, by Charles M. Kozierok, copyright 2005

IPv4 Address Conflict Detection, IETF RFC 5227 (https://tools.ietf.org/html/rfc5227)

IANA Allocation Guidelines for the Address Resolution Protocol, IETF RFC 5494 (https://tools.ietf.org/html/rfc5494)
