Aggregate VoQ drops on 7280/7500 devices
On 7280/7500 devices, the platform architecture uses Virtual Output Queuing (VoQ) between the ingress and egress chips to forward known unicast traffic.
Whenever a packet is to be transmitted, the ingress chip requests credit from the egress chip. Once the credit is granted, the packet is dequeued to the egress chip. While packets are awaiting credit, they are held in the ingress chip buffers, in the Virtual Output Queue (VoQ) for the corresponding egress port.
Accordingly, the output of “show interfaces counters queue detail” on these devices has two sections:
switch#show interfaces counters queue detail
Aggregate VoQ Counters
Egress    Traffic  Pkts  Octets  DropPkts  DropOctets
Port      Class
Et3/1/1   TC0      0     0       0         0
Et3/1/1   TC1      0     0       0         0
… ……
Egress Queue Counters
Port      Traffic  DropPrec  DestType  OutEnqPkts  OutEnqOctets  OutDropPkts  OutDropOctets
          Class
Et4/1/1   TC0      DP0-3     UC        0           0             0            0
……
Drops listed under “Aggregate VoQ Counters” occurred at the ingress chip VoQ for the respective egress port. Drops listed under “Egress Queue Counters” occurred at the egress chip for the respective egress port.
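The credit mechanism described above can be illustrated with a toy model. This is only a sketch to show why congestion appears as ingress VoQ drops: the class name, the packet-based credits, and the buffer limit are all hypothetical simplifications and do not reflect how the hardware accounts for credits or buffers.

```python
from collections import deque


class ToyVoq:
    """Toy ingress-side VoQ for a single egress port (illustrative only)."""

    def __init__(self, capacity_pkts):
        self.capacity_pkts = capacity_pkts  # hypothetical buffer limit, in packets
        self.queue = deque()
        self.drop_pkts = 0   # analogous to DropPkts in the VoQ section
        self.sent_pkts = 0   # packets released to the egress chip

    def enqueue(self, pkt):
        # A packet arriving while the VoQ is full is dropped at the ingress chip.
        if len(self.queue) >= self.capacity_pkts:
            self.drop_pkts += 1
            return False
        self.queue.append(pkt)
        return True

    def on_credit(self, credits):
        # Assumption: one credit releases one waiting packet.
        released = []
        while credits > 0 and self.queue:
            released.append(self.queue.popleft())
            credits -= 1
        self.sent_pkts += len(released)
        return released


def simulate(arrivals_per_tick, credits_per_tick, ticks, capacity_pkts):
    """Offer a fixed load each tick, then apply the credits granted by egress."""
    voq = ToyVoq(capacity_pkts)
    for tick in range(ticks):
        for i in range(arrivals_per_tick):
            voq.enqueue((tick, i))
        voq.on_credit(credits_per_tick)
    return voq


if __name__ == "__main__":
    congested = simulate(arrivals_per_tick=3, credits_per_tick=2, ticks=10, capacity_pkts=4)
    print("sent:", congested.sent_pkts, "dropped at VoQ:", congested.drop_pkts)
```

When arrivals outpace the credits granted by the egress side, the backlog builds up and is dropped at the ingress VoQ, which is why congestion for known unicast traffic shows up in the VoQ section rather than the egress section.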
For known unicast traffic, in the event of congestion, we typically expect to see drops on the VoQ buffers. For troubleshooting congestion on these devices, please refer to the following article:
Egress Queue drops on 7280/7500 devices
On most 7280/7500 devices, BUM (Broadcast, Unknown unicast, Multicast) traffic is forwarded in fabric/egress replication mode by default. In this mode, the credit mechanism for egress VoQs is not used: packets are enqueued directly to the egress chip, so any drops for BUM traffic are counted in the “Egress Queue Counters” only. As a result, a high volume of BUM traffic is more likely to cause egress queue contention and potential packet loss.
switch#show interfaces counters queue detail
Aggregate VoQ Counters
Egress  Traffic  Pkts  Octets  DropPkts  DropOctets
Port    Class
Et5     TC6      153   155720  0         0
Et5     TC7      4113  220989  0         0
Egress Queue Counters
Port    Traffic  DropPrec  DestType  OutEnqPkts  OutEnqOctets  OutDropPkts  OutDropOctets
        Class
Et5     TC1      DP0-3     MC        664484064   909472606828  171704       235014904
Et5     TC7      DP0-3     UC        4120        221353        0            0
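For long outputs, a small script can pick out the rows with non-zero drops and show which section and destination type they belong to. The sketch below is not an Arista tool. It assumes the output has been saved as text with one counter row per line, that ports are Ethernet interfaces abbreviated as "Et...", and that columns appear in the order shown above. The function name `find_drops` and the sample layout are illustrative.

```python
import re

# Sample text using the values from the output above (layout is an assumption).
SAMPLE = """\
Aggregate VoQ Counters
Egress  Traffic  Pkts  Octets  DropPkts  DropOctets
Port    Class
Et5     TC6      153   155720  0         0
Et5     TC7      4113  220989  0         0
Egress Queue Counters
Port    Traffic  DropPrec  DestType  OutEnqPkts  OutEnqOctets  OutDropPkts  OutDropOctets
        Class
Et5     TC1      DP0-3     MC        664484064   909472606828  171704       235014904
Et5     TC7      DP0-3     UC        4120        221353        0            0
"""

# A data row starts with an interface name and a traffic class.
ROW = re.compile(r"^(Et\S+)\s+(TC\d+)\s+(.*)$")


def find_drops(text):
    """Return (section, port, tc, dest_type, drop_pkts) for rows with drops."""
    section = None
    hits = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Aggregate VoQ Counters"):
            section = "voq"
            continue
        if stripped.startswith("Egress Queue Counters"):
            section = "egress"
            continue
        match = ROW.match(stripped)
        if match is None or section is None:
            continue  # header, blank, or elided line
        port, tc = match.group(1), match.group(2)
        rest = match.group(3).split()
        if section == "voq" and len(rest) == 4:
            # Pkts, Octets, DropPkts, DropOctets
            dest_type, drop_pkts = "-", int(rest[2])
        elif section == "egress" and len(rest) == 6:
            # DropPrec, DestType, OutEnqPkts, OutEnqOctets, OutDropPkts, OutDropOctets
            dest_type, drop_pkts = rest[1], int(rest[4])
        else:
            continue  # unexpected column count: skip rather than guess
        if drop_pkts > 0:
            hits.append((section, port, tc, dest_type, drop_pkts))
    return hits


if __name__ == "__main__":
    for hit in find_drops(SAMPLE):
        print(hit)
```

In the sample, the only row reported is the egress queue entry for Et5 TC1 with destination type MC, which matches the pattern described here: drops in the egress section on a multicast queue.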
When egress drops are reported, the most likely culprit is BUM traffic.
In such cases, if the amount of BUM traffic (such as multicast) is expected, one option is to change the replication mode to “ingress only”. With egress replication, we rely on the fabric and egress chip to replicate the packet. With ingress-only replication, the ingress chip itself creates a copy of the packet for every interested egress port. Each copy is enqueued on the VoQ of its egress port and sent over the fabric to the egress chip using the credit mechanism, once the requested credit is granted. This ensures that the egress chips are not overwhelmed with traffic they are unable to handle. With ingress-only replication enabled, congestion drops for BUM traffic are also reported in the “Aggregate VoQ Counters”, as they are for known unicast traffic.
To change the replication mode on the switch to ingress only, you can use the following command:
switch(config)#platform sand multicast replication default ingress
To revert to egress replication, you can use the following command:
switch(config)#platform sand multicast replication default fabric-egress
We can validate the current replication mode on the switch using the following command:
(Note that “20” is the VLAN ID in this scenario)
switch...23:19:26#show platform fap multicast-chain 20
Jericho0.0
Configured global replication mode: ingress only (default)
MulticastId: 20 current replication mode: ingress only
Using unicast buffers
Ingress replication enabled in IRR_IRDB
Ingress Membership from: Ingress chain in IRR_MCDB
mcId  Type     QueueId/   IntfName   outlif
               SysPortId
20    SysPort  91         Ethernet9  20
Egress Membership from: Egress chain is empty
…

switch...23:21:44(config)#show platform fap multicast-chain 20
Jericho0.0
Configured global replication mode: fabric/egress
MulticastId: 20 current replication mode: fabric/egress
Port-channel load-balance: disabled
Using unicast buffers
Ingress replication enabled in IRR_IRDB
Ingress Membership from: Ingress chain in IRR_MCDB
mcId  Type     QueueId/   IntfName   outlif
               SysPortId
20    Queue    0                     20
Mesh Replication Membership from: Fabric bitmap in FDT_IPT_MESH_MC
mcId  Type
20    Core1
Egress Membership from: Egress chain is empty
…
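If this check is repeated across several VLANs or switches, the two mode lines can be pulled out of saved output with a short script. This is a sketch only: it assumes the output was captured as text and recognises just the two mode strings that appear in the outputs above ("ingress only" and "fabric/egress"). The function name `replication_modes` is illustrative and not part of any Arista tooling.

```python
import re

# Only the two mode strings seen in the outputs above are recognised.
MODES = r"(ingress only|fabric/egress)"


def replication_modes(text):
    """Return (configured_mode, current_mode); either is None if not found."""
    configured = re.search(r"Configured global replication mode:\s*" + MODES, text)
    current = re.search(r"current replication mode:\s*" + MODES, text)
    return (
        configured.group(1) if configured else None,
        current.group(1) if current else None,
    )


if __name__ == "__main__":
    captured = (
        "Jericho0.0\n"
        "Configured global replication mode: ingress only (default)\n"
        "MulticastId: 20 current replication mode: ingress only\n"
    )
    configured, current = replication_modes(captured)
    print("configured:", configured, "| current:", current)
    if configured != current:
        print("current mode for this MulticastId differs from the global setting")
```

Comparing the two values is a quick way to confirm that a change of the global default has taken effect for the VLAN being checked.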
If the Egress Queue drops continue to increment even after changing the replication mode, please get in touch with Arista Support for further investigation.