• Category : Tech Tips

 
 

Resilient load-sharing using Nexthop Groups

Introduction: Load-sharing of traffic flows towards a specific prefix in an L3 topology is usually achieved with Equal-Cost Multi-Path (ECMP) routing. With ECMP, multiple nexthops of equal preference are available for the prefix. Traffic is distributed across the nexthops by a hashing algorithm, and packets belonging to the same traffic flow are by default hashed to the same nexthop. A problem with ECMP is that if one of the nexthops is removed, all flows are affected, because the hash is recalculated for every flow over the remaining active nexthops. This can be remediated by using...
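To make the rehashing problem concrete, here is a minimal Python sketch. It assumes a simple hash-modulo selection (CRC32 of a flow key, modulo the number of nexthops). That selection rule, the nexthop addresses and the flow keys are all hypothetical and chosen for illustration; this is not the hashing algorithm of any particular switch.

```python
import zlib

def pick_nexthop(flow_key: str, nexthops: list[str]) -> str:
    """Hypothetical ECMP selection: hash the flow key, modulo the nexthop count."""
    return nexthops[zlib.crc32(flow_key.encode()) % len(nexthops)]

def count_moved_flows(flows, before, after):
    """Return (flows that sat on a removed nexthop, flows whose nexthop changed)."""
    removed = set(before) - set(after)
    on_removed = sum(1 for f in flows if pick_nexthop(f, before) in removed)
    moved = sum(
        1 for f in flows if pick_nexthop(f, before) != pick_nexthop(f, after)
    )
    return on_removed, moved

# Made-up flow keys and nexthop addresses, for illustration only.
flows = [f"10.1.{i // 256}.{i % 256}:{1024 + i}->192.0.2.10:443" for i in range(2000)]
before = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
after = before[:-1]  # the last nexthop goes away

on_removed, moved = count_moved_flows(flows, before, after)
print(f"flows on the removed nexthop: {on_removed}")
print(f"flows that changed nexthop:   {moved}")
```

Only the flows that were on the removed nexthop strictly need a new path, yet under this naive scheme many flows on the surviving nexthops are remapped as well.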

EVE-NG, Arista vEOS-lab and MTU 9214 problems

vEOS-lab in MLAG wouldn’t peer when MTU was 9214. EVE-NG (https://www.eve-ng.net) community edition is a free network emulator for testing network configurations and topologies in a lab environment. As with all emulators, not every configuration that works on real hardware works in emulation; MTU 9214 is one example. Problem: vEOS-lab in MLAG wouldn’t peer when the MTU was 9214 on the MLAG peer link, as shown below:
leaf01#sh run int vl4094
interface Vlan4094
   mtu 9214
   no autostate
   ip address 94.0.0.0/31
leaf01#
Solution: As soon as we removed mtu 9214 from int vl4094, the peers came online.
leaf01#sh run int vl4094
interface...

ZTP with Arista Switches

Overview: This article walks through ZTP from zero to one. Introduction: Zero Touch Provisioning (ZTP) is a feature that allows users to initially provision Arista network switches without user interaction. The switch enters ZTP mode whenever it comes up without a startup-config in flash. It remains in ZTP mode until a user cancels ZTP mode, or until the switch retrieves a startup-config or a boot script. After downloading a file through ZTP, the switch reboots, using the startup configuration from the retrieved file. To provision the switch through Zero Touch Provisioning: Step 1: Mount the switch...

Replacement of MLAG peer switch

Objective: This document describes the procedure for physically replacing one of the MLAG devices (for example, in RMA scenarios where one device has a hardware fault and must be replaced with a new one). Because of peer redundancy, an MLAG setup is expected to keep traffic disruption minimal during the replacement. Zero loss cannot be guaranteed, however, because packets in flight towards the switch being replaced are inevitably lost. Introduction: MLAG by Arista is a method to provide an active-active device level redundancy in the...

Buffer tuning for output discard mitigation

Platform: 7050 Series, 7060, 7260 Series and 7304 Series. This document explains how to mitigate output discards caused by congestion on these platforms. How to determine whether the drops are due to congestion: If the traffic rate is nearing or exceeding the link bandwidth, the drops are most likely due to congestion. If the traffic rate is well within the interface bandwidth and drops still occur, they may be due to microbursts. (Microbursts are huge spikes in traffic rate, which happen and end...
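The difference between an average rate that is "well within the interface bandwidth" and a short spike can be shown with a little arithmetic. The following Python sketch assumes a 10 Gb/s egress link and a made-up traffic pattern measured in 1 ms slots; none of these numbers come from the article.

```python
LINK_BPS = 10_000_000_000  # assumed 10 Gb/s egress link (illustrative)

def rates(bytes_per_ms):
    """Return (average bit rate over the window, peak bit rate over any 1 ms slot)."""
    window_s = len(bytes_per_ms) / 1000
    avg_bps = sum(bytes_per_ms) * 8 / window_s
    peak_bps = max(bytes_per_ms) * 8 * 1000
    return avg_bps, peak_bps

# One second of offered traffic: a 1 Gb/s baseline (125,000 bytes per ms)...
offered = [125_000] * 1000
# ...with a hypothetical 5 ms burst of 2,500,000 bytes per ms (20 Gb/s offered).
for slot in range(500, 505):
    offered[slot] = 2_500_000

avg_bps, peak_bps = rates(offered)
print(f"average over 1 s: {avg_bps / 1e9:.3f} Gb/s")
print(f"peak over 1 ms:   {peak_bps / 1e9:.3f} Gb/s")
print("burst exceeds link rate:", peak_bps > LINK_BPS)
```

A counter polled once per second would report only the average, so the link looks lightly loaded even though the burst briefly offers more traffic than the port can drain.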

PTP Best Master Clock Algorithm (BMCA)

Scope: This article describes the “Best Master Clock Algorithm” (BMCA) and how it is carried out on Arista switches. BMCA: BMCA is used to select a Grandmaster (GM) in a PTP domain. It is also used to decide the PTP port-states on Arista switches. PTP port-states: Master: provides timing to a downstream clock; PTP master ports send out announce messages. Slave: retrieves timing from an upstream clock; in this state, the port doesn’t send out announce messages. Passive: a backup slave port; this state prevents timing loops. There are no announce messages sent out of...

Routing Context – Management VRF and Logs backup

I. Overview: This article describes routing context mode and how to use it to export logs and files from the device to a desktop machine or to a backup/storage server. II. Introduction: In most network infrastructures, networking devices are administered or accessed through a non-default VRF (Management VRF). The non-default VRF has access to network orchestration tools, backup servers and desktop machines, depending on the network infrastructure policy. Network administrators and orchestrators use the Management VRF to communicate with the networking devices. The routing-context mode in EOS CLI makes...

Troubleshooting EVPN IRB with VXLAN

Overview: This article provides a brief introduction to EVPN IRB with VXLAN, along with basic debugging methods. Introduction: Ethernet VPN (EVPN) is an extension of the MP-BGP protocol that introduces a new address family. EVPN is used as a control plane for VXLAN environments to exchange information such as MAC addresses and ARP bindings, along with the VTEP flood list. Additionally, IP prefixes can be exchanged in the overlay using Type-5 routes. Platform Compatibility: The table below captures EVPN IRB support for a few Arista platforms:   Platform Feature Support EOS Release 7050X/ 7300X/...

Dot1q tagged LACPDU

Introduction: This document describes how 802.1Q tagged LACP packets are handled on Arista devices. 802.1Q tagged LACP PDUs: The LACP PDU frames were ingressing from other vendors into an Arista switch with an 802.1Q tag and designated as VLAN 0.
10:03:58.521076 58:ac:78:f2:8c:05 > 01:80:c2:00:00:02, ethertype 802.1Q (0x8100), length 128: vlan 0, p 0, ethertype Slow Protocols, LACPv1, length 110
10:03:59.421028 58:ac:78:f2:8c:05 > 01:80:c2:00:00:02, ethertype 802.1Q (0x8100), length 128: vlan 0, p 0, ethertype Slow Protocols, LACPv1, length 110
Natively, EOS discards tagged LACP PDUs as they are out of spec. These discards can be observed using the...
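What distinguishes such a frame on the wire can be sketched in Python. The example below builds a frame shaped like the capture above (same MAC addresses, 802.1Q tag with VLAN 0 and priority 0, Slow Protocols ethertype 0x8809, LACP subtype 1) and classifies it. The LACP payload is zero-filled as a placeholder, and `classify` is a hypothetical helper, not an EOS function.

```python
import struct

ETHERTYPE_DOT1Q = 0x8100  # 802.1Q tag protocol identifier
ETHERTYPE_SLOW = 0x8809   # Slow Protocols (LACP, among others)
SUBTYPE_LACP = 0x01       # Slow Protocols subtype for LACP

def classify(frame: bytes) -> dict:
    """Report whether a frame carries an 802.1Q tag and whether it is an LACPDU."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    info = {"tagged": False, "vlan": None, "priority": None}
    offset = 14
    if ethertype == ETHERTYPE_DOT1Q:
        # Tag control information: 3-bit priority, 1-bit DEI, 12-bit VLAN ID.
        tci, ethertype = struct.unpack("!HH", frame[14:18])
        info.update(tagged=True, vlan=tci & 0x0FFF, priority=tci >> 13)
        offset = 18
    info["lacp"] = ethertype == ETHERTYPE_SLOW and frame[offset] == SUBTYPE_LACP
    return info

dst = bytes.fromhex("0180c2000002")  # Slow Protocols multicast address
src = bytes.fromhex("58ac78f28c05")  # source MAC from the capture
# Placeholder 110-byte payload: subtype 1, version 1, remaining fields zeroed.
payload = bytes([SUBTYPE_LACP, 0x01]) + bytes(108)

tagged = dst + src + struct.pack("!HH", ETHERTYPE_DOT1Q, 0) + struct.pack("!H", ETHERTYPE_SLOW) + payload
untagged = dst + src + struct.pack("!H", ETHERTYPE_SLOW) + payload

print(len(tagged), classify(tagged))
print(len(untagged), classify(untagged))
```

The tagged frame is 128 bytes with a 110-byte LACP payload, matching the lengths shown in the capture; the untagged variant is what an in-spec LACPDU would look like.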

LACP Rate Fast

Introduction: The LACP rate fast feature sets the rate at which LACP control packets are sent by the partner to once every second. At the normal rate, LACP packets are sent once every 30 seconds. This document describes the workflow of the LACP rate fast feature, including a packet capture and some recommendations/concerns for MLAG setups. How it works: While LACP is synchronizing between two devices, LACP PDUs are sent at a rate of 1 per second until both sides are synchronized. Once this is complete, they are sent at a rate of 1 per 30 seconds. LACP Rate...
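The practical effect of the two rates can be put into numbers with a short Python sketch. The 1-second and 30-second periods come from the text above. The timeout of three missed periods is taken from IEEE 802.1AX rather than from this article, and the helper names are hypothetical.

```python
FAST_PERIOD_S = 1       # LACP rate fast: one PDU per second
SLOW_PERIOD_S = 30      # normal rate: one PDU per 30 seconds
TIMEOUT_MULTIPLIER = 3  # per IEEE 802.1AX: partner times out after 3 missed periods

def pdus_in_window(window_s: int, period_s: int) -> int:
    """Number of LACP PDUs sent in a window at a fixed period."""
    return window_s // period_s

def failure_detection_time(period_s: int) -> int:
    """Seconds without PDUs before the partner is declared timed out."""
    return TIMEOUT_MULTIPLIER * period_s

for name, period in (("fast", FAST_PERIOD_S), ("normal", SLOW_PERIOD_S)):
    print(
        f"{name:>6}: {pdus_in_window(90, period)} PDUs per 90 s, "
        f"timeout after {failure_detection_time(period)} s"
    )
```

The trade-off is more control-plane packets in exchange for faster detection of a silent partner.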

Troubleshooting sFlow

Overview: This document provides basic checks for troubleshooting sFlow. Introduction: Arista switches provide an sFlow agent that samples only ingress traffic from all Ethernet and port-channel interfaces. This agent combines the interface counters and flow samples into sFlow datagrams that are sent to an sFlow collector. An sFlow collector is a server that runs software which analyzes and reports network traffic. Arista switches do not include sFlow collector software. The switch sends sFlow datagrams to the collector located at an IP address specified by a global configuration command. If the collector destination is...

Console Troubleshooting Guide

Objective: The objective of this document is to outline the common issues faced while using a console cable/server to access an Arista switch. This document lists the troubleshooting steps to isolate the issue with these connections. Introduction: In order to access the device, we use either an SSH or a console connection. Normally, the console port is used for serial access to the switch and is used in the following cases:
• initial provisioning of the device manually (when the management ports are not assigned IP addresses)
• the device is inaccessible remotely via SSH
Please refer to the appropriate...

Supervisor replacement procedure

Objective: The aim of this document is to outline the procedure for replacing a supervisor in a modular chassis. Currently, Arista chassis switches support up to two supervisors. Note: Please proceed with the replacement of the supervisor modules as instructed in this document only during a maintenance window. If redundancy is in place for the device, traffic needs to be diverted to the redundant device while the replacement procedure is followed. We are going to cover: Scenario 1. Chassis has only one supervisor, redundant supervisor is not present. 1. The active supervisor is partially functional and needs...

Understanding subscription paths for Open-source Telemetry streaming

Introduction: The purpose of this document is to explain how the subscription paths are constructed for our OpenConfig connector apps (ocprometheus, ockafka, octsdb, etc.) that communicate with TerminAttr and send telemetry data to third-party telemetry backends (Kafka, Prometheus, TSDB, Redis, Graphite, etc.). All our OpenConfig connectors are publicly available and can be found in the goarista GitHub repo: https://github.com/aristanetworks/goarista/tree/master/cmd Most of these OpenConfig connectors (ocprometheus, octsdb) use a YAML or JSON file that contains the paths to subscribe to. Others, like ockafka and ocredis, don’t support paths from a file, so you have to enumerate the...

Streaming EOS telemetry states to Prometheus

Introduction: Prometheus is one of the most popular open-source monitoring and alerting systems; it scrapes numeric time series data over HTTP and stores it. It has a very flexible query language, can send alerts via alertmanager to various platforms and can be integrated easily with many open-source tools. For more details and use cases, please visit https://prometheus.io/docs/introduction/overview/ The purpose of this article is to show how easy it is to deploy and configure Prometheus and Grafana and configure Arista switches to send telemetry states to Prometheus using TerminAttr (the EOS streaming telemetry agent) and one of the OpenConfig connectors that...

PTP slave-passive port election

Scope: This article describes how the slave-passive port election for PTP is done on Arista switches. Slave-Passive port election order: The following comparisons are made, in order, to decide whether a port should take the slave or passive state: 1. Steps removed 2. Parent clock identity 3. Self Port-ID. Steps removed: “Steps removed” is the number of hops separating a PTP clock from the GM. The port with the lower “steps removed” value is preferred as the slave port. The grandmaster in a PTP domain has a “steps removed” value of zero. Every subsequent PTP running port in...
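An ordered comparison like this maps naturally onto tuple comparison. The Python sketch below is a rough illustration, not Arista's implementation. The excerpt only states that a lower “steps removed” value wins; treating lower values as preferred for the two tie-breakers is an assumption here, as are the data model, the port names and the clock identities.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    port: str
    steps_removed: int
    parent_clock_id: str  # equal-length lowercase hex string (hypothetical format)
    port_id: int

def election_key(c: Candidate):
    """Compare in the documented order; 'lower wins' on ties is an assumption."""
    return (c.steps_removed, c.parent_clock_id, c.port_id)

def elect(candidates):
    """Best-ranked candidate becomes slave; the remaining candidates become passive."""
    ranked = sorted(candidates, key=election_key)
    states = {ranked[0].port: "slave"}
    for c in ranked[1:]:
        states[c.port] = "passive"
    return states

# Made-up ports that all receive timing from upstream clocks.
candidates = [
    Candidate("Ethernet1", steps_removed=2, parent_clock_id="001c73fffe000001", port_id=1),
    Candidate("Ethernet3", steps_removed=1, parent_clock_id="001c73fffe000002", port_id=3),
    Candidate("Ethernet2", steps_removed=1, parent_clock_id="001c73fffe000002", port_id=2),
]
print(elect(candidates))
```

Ethernet2 and Ethernet3 tie on the first two criteria, so the self port-ID decides between them; Ethernet1 loses on “steps removed” alone.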

cEOS-lab in GNS3

GNS3 is a great tool to visualize your (home-)lab environment and simulate all kinds of network topologies using different virtualization and isolation technologies. It has been widely used to create environments using vEOS-lab, but because vEOS-lab requires quite some resources (e.g. 2GB of RAM is required) the scale of these labs was often quite limited, especially on low-memory devices. Arista’s cEOS-lab is a new way of packaging the EOS-lab suite. Using the Docker container daemon, it is possible to use the kernel of the host machine and to only run the EOS processes that are required on the machine, making...

How to FTP/SCP/WinSCP

In this document we will look at tools for quickly uploading and downloading files between hosts and Arista switches. 1) SCP: On Linux or macOS, scp is a built-in CLI tool that can be invoked with the scp command. SCP, or secure copy, allows secure transfer of files between a local host and a remote host or between two remote hosts. It uses the same authentication and security as the Secure Shell (SSH) protocol on which it is based. Before we look at the commands and examples, please make sure the steps given below are followed:  ...

“Wait-for-warmup” command – To understand if an agent has initialized

Objective: The aim of this document is to explain the use case and details of the bash command wait-for-warmup (wfw). An equivalent CLI command exists and is described later in this article. Main use case: Agents such as the forwarding agent of the switch take some time to come up after being terminated. The same applies to the linecard and fabric module agents when the fabric modules/linecards are power-cycled. Rather than running other commands repeatedly, it is useful to run the wfw command to check whether the agent has initialized completely. (If interfaces...

Taking packet captures on Arista devices

Control-plane packet capture: TCPDUMP on physical ports and SVIs. This will help in capturing only control plane traffic but no data plane traffic. Running tcpdump natively in EOS:
#tcpdump interface Management1 filter ether proto 0x88cc
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ma1, link-type EN10MB (Ethernet), capture size 65535 bytes
11:33:47.750573 00:1c:73:00:44:d5 (oui Arista Networks) > 01:80:c2:00:00:0e (oui Unknown), ethertype LLDP (0x88cc), length 187: LLDP, length 173: s7151.lab.local
Running tcpdump from bash:
bash ifconfig
et1       Link encap:Ethernet  HWaddr 00:1C:73:00:44:D6
          UP BROADCAST MULTICAST  MTU:9214 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0...
