Latency Analyzer (LANZ) Architectures and Configuration

 
 

Introduction

 

Arista Latency Analyzer, or LANZ, is a technology that tracks and logs buffer congestion and latency in real time.  By exposing network hot spots and microburst oversubscription, LANZ gives the network operator greater insight into when problems occur on the network and why.  With LANZ you can see when congestion happened, track its sources, and export real-time events to external applications.  LANZ also shows the effect of packet buffering on an application, and it monitors and records packet drops during network congestion.  It allows proactive monitoring of the network, rather than the reactive approach of looking for dropped packets after slowness in an application or the overall network has been reported.

 

LANZ operates by setting threshold values on the interface and global buffer pools and then generating records for the start and end of each event in which those thresholds are exceeded.  Update records are also generated when buffer use stays above those thresholds for a prolonged period of time.  The records can then be viewed through a series of show commands on the CLI, logged to syslog, and/or streamed off the switch encoded in Google Protocol Buffer format.
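The start, update, and end records described above can be sketched as a small threshold state machine.  This is only an illustration and not the switch's implementation: the names `Record` and `congestion_records`, the strict comparisons, and the rule that an event ends once the queue drops below the low threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Record:
    kind: str      # "start", "update" or "end"
    time_us: int   # sample time in microseconds
    length: int    # queue length at that sample


def congestion_records(
    samples: Iterable[Tuple[int, int]],
    high: int,
    low: int,
    update_interval_us: int,
) -> List[Record]:
    """Turn (time_us, queue_length) samples into start/update/end records.

    Assumed behaviour: an event starts when the length rises above `high`,
    ends when it falls below `low`, and produces an update whenever it has
    lasted `update_interval_us` since the previous record.
    """
    records: List[Record] = []
    congested = False
    last_report = 0
    for time_us, length in samples:
        if not congested and length > high:
            congested = True
            last_report = time_us
            records.append(Record("start", time_us, length))
        elif congested and length < low:
            congested = False
            records.append(Record("end", time_us, length))
        elif congested and time_us - last_report >= update_interval_us:
            last_report = time_us
            records.append(Record("update", time_us, length))
    return records


# Hypothetical samples: a queue that stays congested for about five seconds.
samples = [(0, 100), (10, 600), (20, 700), (5_000_010, 650), (5_000_020, 200)]
records = congestion_records(samples, high=512, low=256, update_interval_us=5_000_000)
kinds = [record.kind for record in records]
```

With these made-up samples the sketch yields one start, one update, and one end record, which mirrors the three record types LANZ reports.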


This article shows how to enable LANZ on Arista switches and highlights the differences in LANZ functionality across platforms.

 

1) Enabling Latency Analyzer

 

LANZ can be enabled globally on the switch with a single command, and then disabled on individual interfaces where it is not wanted:

 

# Enable LANZ globally

switch(config)#queue-monitor length

 

# Disable LANZ for interface Ethernet 1

switch(config-if-Et1)#no queue-monitor length

 

LANZ can be enabled for the global buffer on the 7150S switches with the following command:

 

# Enable LANZ for the global buffer

7150S(config)#queue-monitor length global-buffer

 

The architectural differences between the 7150S line of switches and the 7500E/7280SE provide slightly different visibility.  On the 7150S, both a high and a low threshold can be configured, as shown in section 2.  The 7150S is a shared-memory switch, meaning that a single pool of memory is allocated across all interfaces to provide packet buffering.  While packets are being serialized, or when multiple interfaces receive traffic and attempt to send it to the same egress port, queuing begins to occur for that egress interface.  Please see the diagram below.

Screen Shot 2015-03-10 at 5.04.54 PM

The 7500E and 7280SE both utilize Virtual Output Queuing (VOQ).  VOQ uses input-side queuing, where a virtual queue exists at ingress for every egress port, to effectively eliminate Head of Line Blocking (HOLB).  Packets are therefore queued at the ingress port, which requires LANZ to monitor buffer depth at the ingress port rather than at the egress port, as seen in the diagram below:
Screen Shot 2015-03-10 at 5.09.23 PM
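The idea of input-side queuing can be illustrated with a toy model.  The class name `IngressPort`, the port names, and the representation of packets as byte strings are all hypothetical; the sketch only shows why a packet for an idle egress port is not held up behind packets for a congested one, and why queue depth is observed on the ingress side.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class IngressPort:
    """Toy model of one ingress port holding a virtual queue per egress port."""

    def __init__(self) -> None:
        # egress port name -> FIFO of packets waiting for that egress port
        self.voqs: Dict[str, Deque[bytes]] = defaultdict(deque)

    def enqueue(self, egress: str, packet: bytes) -> None:
        self.voqs[egress].append(packet)

    def dequeue(self, egress: str) -> Optional[bytes]:
        queue = self.voqs[egress]
        return queue.popleft() if queue else None

    def depth_bytes(self, egress: str) -> int:
        # Buffer depth is measured here, on the ingress side.
        return sum(len(packet) for packet in self.voqs[egress])


port = IngressPort()
port.enqueue("Et1", b"a" * 100)   # Et1 is treated as congested in this example
port.enqueue("Et1", b"b" * 100)
port.enqueue("Et2", b"c" * 64)    # Et2 is idle

# The packet for Et2 is not stuck behind the packets waiting for Et1.
sent = port.dequeue("Et2")
et1_depth = port.depth_bytes("Et1")
```

With a single shared FIFO per ingress port, the Et2 packet would have had to wait for both Et1 packets to drain first.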

2) Setting LANZ Thresholds

 

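The 7150S provides visibility into both individual interface buffers and the global buffer.  Packets buffered in a 7150S queue are held in fixed-size segments of 160 bytes.  LANZ tracks the global buffer in these 160-byte segments and tracks per-interface queues in 480-byte segments, so thresholds on this platform are expressed in segments rather than bytes.  In the commands below, the first value is the high threshold and the second is the low threshold.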

 

# Update thresholds for the global buffer

7150S(config)#queue-monitor length global-buffer thresholds 1000 500

# Update thresholds for the interface buffers

7150S(config-if-Et1)#queue-monitor length thresholds 1000 500
7150S(config-if-Et2)#queue-monitor length thresholds 300 100

For a deeper understanding of how to fine-tune thresholds, see the EOS Central article LANZ Tuning.
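Because the 7150S thresholds are given in segments, it can help to translate them into bytes and into an approximate time to drain.  The sketch below takes 480 bytes per interface segment and 160 bytes per global-buffer segment, as reported by the switch.  The function names and the 10 Gb/s port speed are assumptions for illustration, and the drain time ignores everything except the link rate.

```python
INTERFACE_SEGMENT_BYTES = 480   # per-interface segment size reported by LANZ
GLOBAL_SEGMENT_BYTES = 160      # global-buffer segment size reported by LANZ


def segments_to_bytes(segments: int, segment_bytes: int) -> int:
    """Convert a threshold expressed in segments into bytes."""
    return segments * segment_bytes


def drain_time_us(queue_bytes: int, link_bps: int) -> float:
    """Time in microseconds to transmit queue_bytes at link_bps.

    Rough estimate only: it assumes the port sends nothing else and
    ignores framing overhead.
    """
    return queue_bytes * 8 * 1_000_000 / link_bps


# High threshold of 1000 segments on Et1, assuming a 10 Gb/s port.
et1_high_bytes = segments_to_bytes(1000, INTERFACE_SEGMENT_BYTES)   # 480000 bytes
et1_high_delay = drain_time_us(et1_high_bytes, 10_000_000_000)      # 384.0 microseconds

# The same number of segments in the global buffer is a third of the size.
global_high_bytes = segments_to_bytes(1000, GLOBAL_SEGMENT_BYTES)   # 160000 bytes
```

Under these assumptions, a queue sitting at the Et1 high threshold represents roughly 384 microseconds of added delay.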

 

The 7500E and 7280SE both provide visibility into individual interface buffers only.  The packets buffered on these interface queues are measured in bytes at the interface level, and only a single threshold is configurable per interface.

 

# Update thresholds for the interface buffers

7280SE(config-if-Et1)#queue-monitor length threshold 1000

 

3) Viewing LANZ Output

 

All platforms can show whether LANZ is enabled or disabled, the current threshold levels, and other pertinent information for the device-specific LANZ configuration.  In the output below, interfaces Et1 and Et2 have the adjusted thresholds from the commands shown above, while the remaining interfaces are set to default values.

 

# Viewing queue thresholds (7150S)

7150S#show queue-monitor length status
queue-monitor length enabled
queue-monitor length packet sampling is enabled
queue-monitor length update interval in micro seconds:  5000000
Mirror destination interface is Cpu
Global Buffer Monitoring
------------------------
Global buffer monitoring is enabled
Segment size in bytes :   160
Total buffers in segments : 36864
High threshold : 14415
Low threshold :  5766
 
Per-Interface Queue Length Monitoring
-------------------------------------
Queue length monitoring is enabled
Segment size in bytes :   480
Maximum queue length in segments :  4806
Port thresholds in segments:
Port     High threshold  Low threshold   Mirroring Enabled
Cpu         11792           11792            True
Et1          1000             500            True
Et2           300             100            True
Et3           512             256            True
Et4           512             256            True
Et5           512             256            True
Et6           512             256            True
Et7           512             256            True

-----truncated-----

 

 

# Viewing queue thresholds (7280SE/7500E)

7280SE(config-if-Et1)#show queue-monitor length status
queue-monitor length enabled
queue-monitor length packet sampling is disabled
Per-Interface Queue Length Monitoring
-------------------------------------
Queue length monitoring is enabled
Maximum queue length in bytes : 52428800
Port threshold in bytes:
Port     High threshold   Mirroring Enabled
Et1                1000               False
Et2             5242880               False
Et3             5242880               False
Et4             5242880               False
Et5             5242880               False
Et6             5242880               False
Et7             5242880               False
Et8             5242880               False
Et9             5242880               False
Et10            5242880               False
Et11            5242880               False
Et12            5242880               False
-----truncated-----

 

All platforms can also show LANZ events through the CLI or syslog.  By default, LANZ does not log events to syslog; logging must be enabled by configuring a time interval between syslog entries.

 

# Viewing LANZ events through CLI (7150S)

7150S#show queue-monitor length
Report generated at 2015-03-10 22:57:04
E-End, U-Update, S-Start, TC-Traffic Class
GH-High, GU-Update, GL-Low
Segment size for E, U and S congestion records is 480 bytes
Segment size for GL, GU and GH congestion records is 160 bytes
* Max queue length during period of congestion
+ Period of congestion exceeded counter
--------------------------------------------------------------------------------
Type    Time                  Intf    Congestion     Queue       Time of Max
                              (TC)    duration       length      Queue length
                                      (usecs)        (segments)  relative to
                                                                 congestion
                                                                 start(usecs)
--------------------------------------------------------------------------------
E   0:00:03.48675 ago         Et1(1)    29            2*          0
S   0:00:03.48678 ago         Et1(1)    N/A           2           N/A
E   0:00:03.49949 ago         Et1(1)    29            2*          0
S   0:00:03.49952 ago         Et1(1)    N/A           2           N/A
E   0:00:03.50384 ago         Et1(1)    29            2*          0
S   0:00:03.50387 ago         Et1(1)    N/A           2           N/A
E   0:00:03.50826 ago         Et1(1)    29            2*          0
S   0:00:03.50829 ago         Et1(1)    N/A           2           N/A
E   0:00:03.51763 ago         Et1(1)    29            2*          0
S   0:00:03.51766 ago         Et1(1)    N/A           2           N/A
E   0:00:03.53011 ago         Et1(1)    29            2*          0
S   0:00:03.53014 ago         Et1(1)    N/A           2           N/A           
-----truncated-----

 

# Viewing LANZ events through CLI (7280SE/7500E)

7280SE#show queue-monitor length
Report generated at 2015-03-10 22:11:08

Time                          Interface  Queue     Duration  Traffic  Ingress
                                         Length              Class    Port-set
                                         (bytes)   (secs)
------------------------------------------------------------------------------------

0:03:37.06666 ago             Et50/1         272        1      7    Et25  -Et50/4
0:04:08.06666 ago             Et50/1         272        1      7    Et25  -Et50/4
0:04:37.06666 ago             Et50/1         272        1      7    Et25  -Et50/4
0:05:07.06666 ago             Et50/1         272        1      7    Et25  -Et50/4
0:05:38.06666 ago             Et50/1         272        1      7    Et25  -Et50/4
0:06:07.06666 ago             Et50/1         272        1      7    Et25  -Et50/4
0:06:37.06666 ago             Et50/1         272        1      7    Et25  -Et50/4
0:07:08.06666 ago             Et50/1         272        1      7    Et25  -Et50/4
0:07:37.06666 ago             Et50/1         272        1      7    Et25  -Et50/4
0:08:07.06666 ago             Et50/1         272        1      7    Et25  -Et50/4
0:08:38.06666 ago             Et50/1         272        1      7    Et25  -Et50/4
0:09:07.06666 ago             Et50/1         272        1      7    Et25  -Et50/4
0:09:37.06666 ago             Et50/1         272        1      7    Et25  -Et50/4
0:10:08.06666 ago             Et50/1         272        1      7    Et25  -Et50/4
0:10:37.06666 ago             Et50/1         272        1      7    Et25  -Et50/4
0:11:07.06666 ago             Et50/1         272        1      7    Et25  -Et50/4
-----truncated-----

 

# Viewing LANZ events in syslog

switch(config)#queue-monitor length log 300

switch(config-if-Et2)#show log | grep threshold

Oct 27 12:48:22 switch QUEUE_MONITOR-6-LENGTH_OVER_THRESHOLD: Interface Ethernet1 queue length is over threshold of 512, current length is 1024
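Once LANZ events reach syslog, they are easy to post-process on a log collector.  The following is a minimal sketch that extracts the interface, threshold, and current length from a message in the format shown above.  The function name `parse_lanz_syslog` is hypothetical, and the pattern is written only against this one sample line, so other EOS releases or platforms may format the message differently.

```python
import re
from typing import Dict, Optional, Union

# Pattern based solely on the sample log line shown above.
LANZ_SYSLOG = re.compile(
    r"QUEUE_MONITOR-6-LENGTH_OVER_THRESHOLD: Interface (?P<interface>\S+) "
    r"queue length is over threshold of (?P<threshold>\d+), "
    r"current length is (?P<length>\d+)"
)


def parse_lanz_syslog(line: str) -> Optional[Dict[str, Union[str, int]]]:
    """Return the fields of a LANZ over-threshold message, or None if no match."""
    match = LANZ_SYSLOG.search(line)
    if match is None:
        return None
    return {
        "interface": match.group("interface"),
        "threshold": int(match.group("threshold")),
        "length": int(match.group("length")),
    }


sample = (
    "Oct 27 12:48:22 switch QUEUE_MONITOR-6-LENGTH_OVER_THRESHOLD: "
    "Interface Ethernet1 queue length is over threshold of 512, "
    "current length is 1024"
)
event = parse_lanz_syslog(sample)
```

For the sample line, `event` holds the interface name `Ethernet1`, a threshold of 512, and a current length of 1024.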

 

The 7150 platform additionally supports viewing queue drops, high-threshold statistics, and the latency added by queue depth.  You can also generate a CSV report listing the most recent 100,000 events.

 

# Viewing more detailed LANZ events

7150S(config)#show queue-monitor length ?

     Ethernet       Ethernet interface
     all            Display all the congestion records
     cpu            Cpu port(s)
     csv            CSV format, with oldest samples first
     drops          Queue drops information
     global-buffer  Display buffer usage
     limit          Limit samples displayed
     statistics     high threshold counts
     status         Display status
     tx-latency     Display queue tx-delay
     >              Redirect output to URL
     >>             Append redirected output to URL
     |              Output modifiers

 

Additionally, the 7150 platform can stream LANZ records to external devices via Google Protocol Buffers (GPB).  The commands below make the switch listen on port 50001 for any GPB client that connects to receive the records.

 

# Enabling LANZ Streaming

7150S(config)#queue-monitor streaming

7150S(config-qm-streaming)#no shutdown
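On the collector side, a client only needs to connect to the switch on port 50001 and read the stream.  The sketch below does no more than that: it assumes the stream is carried over TCP, uses a hypothetical function name `read_first_bytes`, and returns the raw bytes without decoding them.  Turning those bytes into LANZ records requires the LANZ protocol buffer message definitions, which are outside the scope of this example.

```python
import socket


def read_first_bytes(
    host: str,
    port: int = 50001,       # LANZ streaming port mentioned above
    max_bytes: int = 4096,
    timeout: float = 5.0,
) -> bytes:
    """Connect to the switch and return the first chunk of raw stream data.

    The returned bytes are still protobuf-encoded; decoding is not shown.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        return conn.recv(max_bytes)
```

A call such as `read_first_bytes("7150s.example.net")`, where the hostname is a placeholder, would block until the switch sends data or the timeout expires.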

 

4) LANZ Traffic Sampling

 

Additionally, the 7150 platform can be configured to automatically send traffic experiencing congestion to either the CPU or an egress interface once a queue threshold has been crossed.

 

# Enable LANZ mirroring

7150S(config)#queue-monitor length mirror

 

# Configure mirror destination

7150S(config)#queue-monitor length mirror destination ?
     Cpu  Cpu port(s)
     Ethernet  Ethernet interface

 

This is useful either for exporting the congested traffic to a packet capture device or other analysis tool, or for sending it directly to the CPU of the switch for immediate inspection.  To inspect the traffic on the switch itself, use the following command:

 

7150S(config)#tcpdump queue-monitor

 

Alternatively, you can view the same output from the bash shell:

 

7150S(config)#bash tcpdump -i lanz

 

The output below was generated using basic ping traffic, but it shows how this functionality can provide detailed visibility into buffered traffic, either on the switch itself or on another capture device.

 

7150S(config)#tcpdump queue-monitor

tcpdump: WARNING: lanz: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lanz, link-type EN10MB (Ethernet), capture size 65535 bytes

23:01:17.794281 00:1c:73:00:44:d6 > 00:1c:73:74:32:7f, ethertype 802.1Q (0x8100), length 1138: vlan 1006, p 0, ethertype IPv4, 5.0.0.1 > 5.0.0.2: ip-proto-1
23:01:17.991120 00:1c:73:00:44:d6 > 00:1c:73:74:32:7f, ethertype 802.1Q (0x8100), length 1138: vlan 1006, p 0, ethertype IPv4, 5.0.0.1 > 5.0.0.2: ip-proto-1
23:01:18.091730 00:1c:73:00:44:d6 > 00:1c:73:74:32:7f, ethertype 802.1Q (0x8100), length 1138: vlan 1006, p 0, ethertype IPv4, 5.0.0.1 > 5.0.0.2: ip-proto-1
23:01:18.599131 00:1c:73:00:44:d6 > 00:1c:73:74:32:7f, ethertype 802.1Q (0x8100), length 1138: vlan 1006, p 0, ethertype IPv4, 5.0.0.1 > 5.0.0.2: ip-proto-1
23:01:18.838424 00:1c:73:00:44:d6 > 00:1c:73:74:32:7f, ethertype 802.1Q (0x8100), length 1138: vlan 1006, p 0, ethertype IPv4, 5.0.0.1 > 5.0.0.2: ip-proto-1
23:01:19.745172 00:1c:73:00:44:d6 > 00:1c:73:74:32:7f, ethertype 802.1Q (0x8100), length 1138: vlan 1006, p 0, ethertype IPv4, 5.0.0.1 > 5.0.0.2: ip-proto-1
23:01:19.792002 00:1c:73:00:44:d6 > 00:1c:73:74:32:7f, ethertype 802.1Q (0x8100), length 1138: vlan 1006, p 0, ethertype IPv4, 5.0.0.1 > 5.0.0.2: ip-proto-1
23:01:19.906370 00:1c:73:00:44:d6 > 00:1c:73:74:32:7f, ethertype 802.1Q (0x8100), length 1138: vlan 1006, p 0, ethertype IPv4, 5.0.0.1 > 5.0.0.2: ip-proto-1
-----truncated-----

 

5) LANZ lite (7500 and 7048T)

 

A lightweight LANZ capability is also available on first-generation 7500 modular and 7048 fixed form-factor switches.  The granularity of event polling is limited to a single event per second and, just like on the 7500E/7280SE switches, only a single threshold is configurable and the queue is measured in bytes.

 

LANZ is configured on these platforms the same way as on the 7500E/7280SE.

 

# Enable LANZ globally

7048(config)#queue-monitor length

 

# Update thresholds for the interface buffers

7048(config-if-Et1)#queue-monitor length threshold 1000

 

Due to limited hardware support on these platforms, it is not possible to monitor congestion events for all queues simultaneously as on other systems.  Only the largest congestion events are captured, in part because of the less frequent polling cycles.  Even so, this functionality still adds significant visibility into the network and its congestion events.
