Console Troubleshooting Guide

Contents: Objective; Introduction; Initial Manual Provisioning; Console is not working post deployment; SSH and Console both are not accessible

Objective: The objective of this document is to outline the common issues faced while using a console cable/server to access an Arista Switch. This document lists the troubleshooting steps to isolate the issue with these connections.

Introduction: In order to access the device, we use either an SSH or a Console connection. Normally, the console port is used for serial access to the switch and is used in the following cases:

• initial provisioning of the device manually (when the management ports are not assigned IP...
Continue reading →

Supervisor replacement procedure

Contents: Objective; Verification prior supervisor replacement; Redundancy Protocol: Stateful Switchover (SSO); Redundancy Protocol: Route Processor Redundancy (RPR); Steps to be followed during replacement; Scenario 1. Chassis has only one supervisor, redundant supervisor is not present; 1. The active supervisor is partially functional and needs a replacement; 2. Switch is unresponsive via management or console, forwarding across the chassis is broken; Scenario 2. The chassis has two supervisor modules; 1. Active supervisor is malfunctioning and needs to be replaced; 2. The standby supervisor is malfunctioning and needs to be replaced

Objective: The aim of this document is to outline the procedure when replacing a supervisor in a modular chassis. Currently, Arista chassis...
Continue reading →

Understanding subscription paths for Open-source Telemetry streaming

Contents: Introduction; Checking the paths; 1) Using TerminAttr; 2) Using the Telemetry browser in CVP; Metric Explorer; Metric path structuring; Interfaces counters; DOM values QSFP RX Power on Modular boxes; How it looks like on the server side; Example 1 – Rx Power for QSFP on Fixed systems; Example 2 – Rx/Tx Power for SFPs on Fixed Systems; Example 3 – interface counters; Graph examples on Grafana Data streamed from octsdb to Graphite; Data streamed from ocprometheus to Prometheus; Sample Configs; TerminAttr; Ocprometheus; Ockafka; Octsdb; Useful Links

Introduction: The purpose of this document is to understand how the subscription paths are constructed for our openconfig connector apps (ocprometheus, ockafka, octsdb, etc.) that communicate with TerminAttr and send telemetry data to 3rd...
Continue reading →

Streaming EOS telemetry states to Prometheus

Contents: Introduction; Prerequisites; Installing Prometheus and Grafana; Installing Prometheus; Add new targets; Option 1; Option 2; Option 3; Installing Grafana; Installing and configuring ocprometheus; Git Clone the Arista GO library; Compile ocprometheus in GO; Option 1) Using the binary; Option 2) Install it as a swix; Flags cheat sheet; Sample EOS Configuration No VRF and streaming to both CVP and Prometheus; No VRF and only Prometheus; VRF management and streaming to both CVP and Prometheus; VRF and only Prometheus; VRF and only Prometheus using authentication; PDP: VRF and only Prometheus; Creating dashboards in Grafana; How to graph data only for specific interface?; Pre-defined dashboard; Using rule records in Prometheus for EOS paths; Ocprometheus.yml on EOS; Prometheus.yml on the server; Kernel_rules.yml on the server; Troubleshooting tips; Checking Logs and troubleshooting; Common issues; Context_deadline_exceeded; Not seeing...
Continue reading →

Resilient load-sharing using Nexthop Groups

Contents: Introduction; Nexthop Groups; Resiliency in Nexthop Groups; Even flow distribution; Adding Nexthops; Combining Addition of Nexthops and Even Flow Distribution; Hashing; Multi-switch design; Advertising VIP address; Summary

Introduction: Load-sharing of traffic flows towards a specific prefix in an L3 topology is usually achieved with Equal-Cost Multi-Path (ECMP) routing. With ECMP, multiple nexthops of equal preference are available for the prefix. Traffic is distributed across the different nexthops based on a hashing algorithm, and packets belonging to the same traffic flow are by default hashed to the same nexthop. A problem with ECMP is that if one of the nexthops is removed, all flows are affected as a new hash...
Continue reading →
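
The remapping problem described in the ECMP excerpt above can be illustrated with a minimal sketch. It assumes a plain hash-modulo-N selection over a 5-tuple, with SHA-256 standing in for whatever hash the hardware really uses. The flow tuples, the nexthop names (nh1 to nh4) and both helper functions are hypothetical and are not taken from the article or from EOS.

```python
import hashlib


def pick_nexthop(flow, nexthops):
    """Plain modulo hashing: hash the flow key, then index into the nexthop list."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return nexthops[int.from_bytes(digest[:8], "big") % len(nexthops)]


def moved_flows(flows, before, after):
    """Flows that changed nexthop even though their old nexthop is still present."""
    return [
        f for f in flows
        if pick_nexthop(f, before) in after
        and pick_nexthop(f, before) != pick_nexthop(f, after)
    ]


# Hypothetical 5-tuples: (src IP, dst IP, protocol, src port, dst port)
flows = [("10.0.0.%d" % (i % 250), "192.0.2.1", 6, 1024 + i, 80) for i in range(1000)]
before = ["nh1", "nh2", "nh3", "nh4"]
after = ["nh1", "nh2", "nh4"]  # nh3 has been removed

disturbed = moved_flows(flows, before, after)
print(len(disturbed), "of", len(flows), "flows moved although their nexthop survived")
```

With this naive scheme a large share of the flows that never used nh3 still land on a different nexthop after the change, which is the behaviour a resilient scheme aims to avoid.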

PTP slave-passive port election

Contents: Scope; Slave-Passive port election order; Steps removed; Parent clock identity; Self Port-ID; Example; Topology; Configurations; Outputs; Election based on the “steps removed” value; Topology; Observations; Election based on the parent clock identity; Topology; Observations; Election based on the self port-ID; Topology; Observations

Scope: This article describes how the slave-passive port election for PTP is done on Arista switches.

Slave-Passive port election order: The following comparisons are made, in order, to decide whether a port should take the slave or the passive state:

1. Steps removed
2. Parent clock identity
3. Self Port-ID

Steps removed: “Steps removed” is the number of hops separating a PTP clock from the GM. The port that has a lower “steps removed”...
Continue reading →
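
The three-step comparison order listed in the PTP excerpt above can be sketched as an ordered tuple comparison. This is only an illustration under a stated assumption: the excerpt is cut off before it says which value wins at each step, so the sketch assumes the lower value is preferred at every step. The class, the field names and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortCandidate:
    name: str
    steps_removed: int    # hops between this clock and the GM
    parent_clock_id: int  # parent clock identity, held as an integer here
    port_id: int          # the port's own port-ID


def election_key(port):
    # Compared left to right, matching the order given in the excerpt.
    # ASSUMPTION: the lower value wins at each step.
    return (port.steps_removed, port.parent_clock_id, port.port_id)


def elect(ports):
    """Return a {port name: state} map with one slave port and the rest passive."""
    best = min(ports, key=election_key)
    return {p.name: ("slave" if p is best else "passive") for p in ports}


# Hypothetical candidates: equal steps removed and parent identity, so the
# decision falls through to the self port-ID.
ports = [
    PortCandidate("Ethernet1", 1, 0x001C73FFFE000001, 2),
    PortCandidate("Ethernet2", 1, 0x001C73FFFE000001, 1),
]
print(elect(ports))
```

Using a tuple as the sort key keeps the tie-breaking order explicit: a later field is consulted only when all earlier fields are equal.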

cEOS-lab in GNS3

GNS3 is a great tool to visualize your (home-)lab environment and simulate all kinds of network topologies using different virtualization and isolation technologies. It has been widely used to create environments using vEOS-lab, but because vEOS-lab requires quite some resources (e.g. 2GB of RAM is required) the scale of these labs was often quite limited, especially on low-memory devices. Arista’s cEOS-lab is a new way of packaging the EOS-lab suite. Using the Docker container daemon, it is possible to use the kernel of the host machine and to only run the EOS processes that are required on the machine, making...
Continue reading →

How to FTP/SCP/WinSCP

In this document we will look at tools for quickly uploading and downloading files between hosts and Arista switches.

1) SCP: On Linux or Mac, scp is a built-in CLI tool and is invoked with the scp command. SCP, or secure copy, allows secure transfer of files between a local host and a remote host, or between two remote hosts. It uses the same authentication and security as the Secure Shell (SSH) protocol on which it is based. Before we look at the commands and examples, please make sure the steps given below are followed:  ...
Continue reading →

“Wait-for-warmup” command – To understand if an agent has initialized

Contents: Objective; Main use case; Details of the wfw command; Examples; a) The agent is up and running; b) The agent is shutdown; WFW command without the verbose option; Equivalent CLI command for wfw; WFW command behavior after terminating an EOS agent

Objective: The aim of this document is to convey the use case and details of the bash command wait-for-warmup (wfw). An equivalent CLI command exists and is described later in this article.

Main use case: Agents like the forwarding agent of the switch take some time to come up when terminated. The same is the case for the linecard and fabric module agents when...
Continue reading →

Taking packet captures on Arista devices

Contents: Control-plane packet capture; Running tcpdump natively in EOS; Running tcpdump from bash; Data-plane packet capture; Monitor session; Running tcpdump for data-plane traffic natively in EOS; Running tcpdump for data-plane traffic from bash; Running tcpdump for data-plane traffic from bash for 7050/7060/7260 devices using mirror to GRE; Test setup; Configuration; Show commands; Viewing the packet capture on CLI; Viewing the packet capture on wireshark; To enable sflow globally; Configuring the agent source address; Configuring the polling interval; Configuring the sampling rate and sample contents; Enabling sflow; Show commands; Running tcpdump for data-plane traffic from bash for 7050/7060 devices using sflow; Configuration; Show commands; Viewing the packet capture on CLI; Limitations; References

Control-plane packet capture: TCPDUMP on physical ports and SVIs. This will help in capturing...
Continue reading →

Troubleshooting Multicast

Contents: Overview; Preliminary Checks; Common Issues in Multicast; 1) Multicast Bridging; (i) Multicast traffic not received by the subscriber; (ii) Multicast traffic is flooded within a VLAN; 2) Multicast Routing; (i) Last-Hop Router (LHR); (a) (*,G) mroute is missing; (b) OIL is not populated for a (*,G); (ii) First-Hop Router (FHR); (a) (S,G) mroute is missing or the IIF for (S,G) has null populated; (b) OIL is stuck in Register on FHR; (iii) Rendez-vous Point (RP); (a) (*,G) mroute is missing; (b) (S,G) mroute is missing; (iv) Unresolved Mroutes; 3) Is Multicast traffic being software forwarded i.e. is traffic going to CPU?; Logs Collection

Overview: The aim of this article is to highlight common issues related to...
Continue reading →

VxLAN troubleshooting guide

Contents: VxLAN Basic Troubleshooting Guide; I. Objective; II. Introduction; III. Topology; IV. Generic Configurations to be checked; IV. Scenario specific troubleshooting; VxLAN Bridging; VxLAN Routing

VxLAN Basic Troubleshooting Guide

I. Objective: Provide basic, generic troubleshooting steps for customers who encounter a VxLAN issue in their network.

II. Introduction: Troubleshooting VxLAN involves a few steps, described in the upcoming sections of this document. The topology referred to below includes VxLAN configurations with servers 1, 2 and 3 as the host devices, which obtain connectivity over a VxLAN tunnel. The troubleshooting steps are split into routing and bridging to cover the possible scenarios.

III. Topology

IV. Generic Configurations to be checked: A....
Continue reading →

Basic BGP Troubleshooting

Contents: Objective; I. Neighborship; a. Idle (NoIf); b. Idle(MaxPath); c. The neighborship state is flapping between Connect and Active; d. Stuck in Active state; II. Route Advertisement/Reception; a. Route reception issue; b. Route advertisement issue; c. AS path loop; III. Route Installation; Case 1: The prefix received from one peer is preferred over the same prefix received from another peer; Case 2: Route for the prefix is installed from a routing protocol other than BGP; Case 3: No route to the next hop; VI. Logs collection

Objective: The objective of this document is to outline the various common issues faced in BGP and the troubleshooting commands for the same.

I. Neighborship: BGP sends unicast messages,...
Continue reading →

Centralized vs. Distributed VxLAN Routing with EVPN

Tech Note: Centralized vs. Distributed VxLAN Routing with EVPN

Over the past few years, EVPN VxLAN deployments have become an increasingly popular overlay architecture, primarily in data-center layer 3 leaf-spine (L3LS) fabrics. With this popularity, numerous deployment topologies and configuration options have presented themselves. This article reflects our observations, based on real-world deployment experience, on one such choice: centralized vs. distributed gateways. When deploying EVPN VXLAN integrated routing and bridging (IRB), both VXLAN bridging and VXLAN routing are required concurrently on the switch. This capability is also commonly referred to as an EVPN VxLAN gateway. There are...
Continue reading →

Displaying Neighbors’ Names with OSPF and BGP

This article describes how to configure Arista devices to display user-defined names for OSPF and BGP neighbors.

OSPF: First define name-to-IP-address mappings, one per neighbor, where the IP address is the neighbor’s OSPF router ID:

SW1(config)# ip host SW2 2.2.2.2

Next, enable OSPF name resolution:

SW1(config)# ip ospf name-lookup

Finally, validate the output of the ‘show ip ospf neighbor’ command. It should display the user-defined name instead of the router ID:

SW1(config)# show ip ospf neighbor
Neighbor ID   VRF         Pri       State             Dead Time     Address        Interface
SW2   ...
Continue reading →

25G Lane Speed

Contents: Introduction; Platform Compatibility; Configuration; Configure forced 25G speed; Output from “show interface status” command; Configure forced 10G/1G speed on Et2-4; Output from “show interface status” command; Configure forced 10G speed; Output from “show interface status” command; Configure forced 1G/100M speed on Et2-4; Output from “show interface status” command; Configure forced 1G speed; Output from “show interface status” command; Configure forced 10G/100M speed on Et2-4; Output from “show interface status” command; Status

Introduction: With the introduction of support for 25GbE on servers and switches, we expect to see a rapid movement to server attachment at 25G, replacing the use of servers at 40G. Even though 25G is becoming the norm these days, most of the deployment is...
Continue reading →

Basic troubleshooting steps for some CVP and telemetry issues

Contents: Objective; General issues covered; 1. The CVP web-explorer is not reachable; 2. A configlet/image bundle push task to the switch failed; 3. Device not getting added to telemetry; Logs to be collected from the Switch; Logs to be collected from the CVP server

Objective: The aim of this document is to convey a set of troubleshooting steps that can be carried out when running into issues with CVP and telemetry.

General issues covered:

Issue 1- The CVP web-explorer is not reachable
Issue 2- A configlet/image bundle push task to the switch failed
Issue 3- Device not getting added to telemetry

1. The CVP web-explorer is not...
Continue reading →

Password Recovery

This article describes how to gain access to an Arista 7130 device if you lose the password.

Contents: There are two solutions; Password recovery using grub; MOS 0.20 and later; MOS 0.19.10 and earlier; MOS 0.14.3 and earlier; No grub prompt from MOS 0.17.0 to 0.18.6; Factory restore via USB

There are two solutions:

• Password recovery using grub from the serial console.
• Factory restore via USB.

Password recovery using grub: Reboot, either by using the reload command at the command line or by power cycling the device. The grub menu will appear after the BIOS message “Press <del> or to enter setup” or “Press <del> or to...
Continue reading →

Interface Status

The “show interfaces status” command shows the link status of the receive (Rx) and transmit (Tx) sides. Besides “up” and “down”, the command gives additional information about the status of the port. This includes:

Link status:

• shutdown – the port has been shut down through the management platform.
• (Tx only) no source – the interface is not sourcing a signal from anywhere.
• (Rx only) no signal – there is no signal received.
• (Rx only) no link – a signal is detected on the line side but there is not a valid link coming into the device from upstream.

Flags returned from the underlying driver...
Continue reading →
