Monitoring Link Quality Using Forward Error Correction (FEC) Data on Arista Switches

 
 
Introduction

When forward error correction is enabled, it provides a set of statistics that can be used to monitor the health of a link at layer 1. Comparing trends over time may make it possible to predict which links are likely to experience service-impacting error rates, so that action can be taken before these events occur. This document describes these statistics and how to monitor them on an Arista switch running EOS.

Forward Error Correction

Forward error correction (FEC) is a technique used in data communications where data is partitioned into blocks and parity bits are added to each block. When errors are sufficiently randomly distributed, the receiver can use these parity bits to identify the bits which are in error and correct them.

In links using copper twinax cables or direct detect optics, there are two types of FEC in use: Reed-Solomon and Firecode. Reed-Solomon (RS) is a ‘stronger’ FEC and is the most prevalent. Firecode (FC), also known as BASE-R or Clause 74 FEC, is a ‘weaker’ FEC but introduces less latency on the link than RS-FEC. The strength of a FEC is typically characterized by the worst case BER that it can correct. Another characteristic of a FEC is its ability to handle bursts of errors, that is, errors which do not exceed the worst case BER when averaged but can still cause uncorrectable errors. RS-FEC is better at handling bursts of errors than FC-FEC.

Reed-Solomon FEC

RS-FEC has 2 variants in these use cases. The first is RS(528,514), which is used on links using Non Return to Zero (NRZ) encoding. These are links running at 25G, 50G-2 (2x25G lanes) and 100G-4 (4x25G lanes). The second variant is RS(544,514), which is used on Pulse Amplitude Modulation 4-level (PAM4) encoded links. These are links running at 50G-1 (1x50G lane), 100G-2 (2x50G lanes), 200G-4 (4x50G lanes), and 400G-8 (8x50G lanes). These can be further abbreviated as RS-528 and RS-544 respectively. The difference between the two is the number of parity symbols carried to protect the data. In RS-528 there are 528-514=14 parity symbols (140 parity bits). RS-544 has 544-514=30 parity symbols (300 parity bits). As a result of the additional parity information, RS-544 is a stronger FEC.
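The parity arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `rs_parity` is hypothetical and is not part of any Arista tooling.

```python
SYMBOL_BITS = 10  # RS-FEC symbols are 10 bits wide

def rs_parity(n_symbols, k_symbols):
    """Return (parity symbols, parity bits, overhead ratio) for an RS code
    with n total symbols and k data symbols per codeword."""
    parity_symbols = n_symbols - k_symbols
    parity_bits = parity_symbols * SYMBOL_BITS
    overhead = parity_symbols / k_symbols  # parity relative to data
    return parity_symbols, parity_bits, overhead

for name, n in (("RS-528", 528), ("RS-544", 544)):
    symbols, bits, overhead = rs_parity(n, 514)
    print(f"{name}: {symbols} parity symbols, {bits} parity bits, "
          f"{overhead:.1%} overhead")
```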

FEC | Worst Case Correctable BER | Usage
Firecode | 10^-8 (1 error/10^8 bits) | 1. 25GbE/50-2GbE over copper twinax cables identified as CA-S or CA-N.* 2. 25GbE over 25GBASE-SR/LR optics when the preferred RS-528 is unavailable
RS-528 | 10^-5 | 1. 25GbE/50-2GbE over copper twinax cables identified as CA-L* 2. 25GbE over 25GBASE-SR/LR optics 3. 100GbE over copper twinax cables and most optical media.
RS-544 | 10^-4 | PAM4 links (400G-8, 200G-4, 100G-2, 50G-1) for all media types.

*Please refer to the 25G FAQ found on www.arista.com for more details.

Transmitting with RS-FEC

The RS-FEC transmission process builds a codeword which consists of 5140 bits of user data and adds 140 bits of parity.  This creates a block that is 5280 bits in size.  This codeword internally is arranged as 10 bit ‘symbols’.  When transmitting, these symbols are distributed round robin to FEC lanes which are then mapped to PMD lanes.  For a 100GbE over 4x25G serdes (100G-4) there are 4 FEC and 4 PMD lanes so the mapping is 1:1.  For 100G-2 there are 4 FEC lanes but 2 PMD lanes so each PMD lane will carry 2 FEC lanes.  The simplified transmission process is shown in the figure below. 

Transmission Process with RS-FEC Encoded Data
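The round robin distribution described above can be sketched as follows. This is a simplified model that only tracks which symbols land on which FEC lane and which FEC lanes share a PMD lane; it ignores alignment markers and the actual multiplexing on the wire. The function names are hypothetical, and grouping consecutive FEC lanes onto one PMD lane is an assumption chosen to match the lane mapping shown in the phy detail output later in this document.

```python
def distribute_symbols(symbols, fec_lanes=4):
    """Deal codeword symbols round robin across the FEC lanes."""
    lanes = [[] for _ in range(fec_lanes)]
    for index, symbol in enumerate(symbols):
        lanes[index % fec_lanes].append(symbol)
    return lanes

def map_to_pmd(fec_lane_data, pmd_lanes):
    """Return {pmd_lane: [fec_lane, ...]} with consecutive FEC lanes
    grouped onto each PMD lane (assumed grouping)."""
    per_pmd = len(fec_lane_data) // pmd_lanes
    return {pmd: list(range(pmd * per_pmd, (pmd + 1) * per_pmd))
            for pmd in range(pmd_lanes)}

codeword = list(range(528))           # stand-in for 528 ten-bit symbols
fec = distribute_symbols(codeword)
print([len(lane) for lane in fec])    # symbols carried by each FEC lane
print(map_to_pmd(fec, 4))             # 4 PMD lanes: 1:1 mapping
print(map_to_pmd(fec, 2))             # 2 PMD lanes: two FEC lanes each
```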

Receiving with RS-FEC

On reception the process is generally reversed.  The codeword is assembled from the incoming symbols, the parity bits are used to perform error correction and then the data is sent on for further receive processing.  The simplified process is shown below. 

Reception of RS-FEC Encoded Data

The correction algorithm identifies each bit which is in error and applies corrections.  It also counts each symbol in which a correction was made.  The correction process tracks the number of corrected codewords, the number of corrected symbols and, on some PHYs, the number of bits corrected.  

 

When Correction Fails

RS-FEC can make corrections only when the number of bits in error does not exceed limits. The ability to correct is based on the number of symbols containing bit errors in a given codeword. If the number of parity symbols is 2t, then RS-FEC can correct up to t symbols.  If more than t symbols contain errors then the codeword is uncorrectable.  For RS-528, 2t=14 and so t=7.  This means that as many as 70 bits (7 symbols * 10 bits per symbol) may be in error and the codeword can still be corrected.  If the bit errors are distributed 1 per symbol, as few as 8 bit errors can result in an uncorrected codeword since 8 > 7, the maximum number of symbols correctable in a codeword.  For RS-544, 2t=30 and so up to t=15 symbols may have errors in a codeword which is correctable. 
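The correction limit can be expressed in a couple of lines. The sketch below only models the counting rule described above (a codeword is correctable when at most t symbols contain errors); it does not implement Reed-Solomon decoding, and the function names are hypothetical.

```python
def max_correctable_symbols(n_symbols, k_symbols=514):
    """t = (number of parity symbols) // 2."""
    return (n_symbols - k_symbols) // 2

def is_correctable(symbols_in_error, n_symbols):
    """A codeword is correctable when at most t symbols contain bit errors."""
    return symbols_in_error <= max_correctable_symbols(n_symbols)

# 70 bit errors packed into 7 symbols: correctable under RS-528.
print(is_correctable(7, 528))
# 8 bit errors spread one per symbol: uncorrectable under RS-528,
# but still within the t=15 limit of RS-544.
print(is_correctable(8, 528), is_correctable(8, 544))
```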

 

Monitoring RS-FEC

Arista EOS reports a rich set of parameters at layer 1 for monitoring link performance. These parameters can be monitored to determine when a link is experiencing errors which may not yet have impacted application data. When FEC is enabled, FEC codewords encapsulate all data transmitted, including the PCS IDLEs transmitted when there is no user data. Because of this, FEC statistics can be used to determine if a link is performing well regardless of whether any packet data is transmitted across the link.

Show Interfaces Phy Detail

The primary display for layer 1 monitoring and for FEC statistics is the ‘show interfaces phy detail’ command.  The parameters listed in the table below are the primary ones for monitoring link performance with FEC. 

 

Parameter | Description | Platform Support
FEC corrected codewords | Count of codewords which had correctable errors in the last polling period. | All
FEC lane corrected symbols | Count of symbols from correctable codewords which had bits corrected in the last polling period. | All
FEC corrected symbol rate | Ratio of corrected symbols to total symbols received in the last polling period. Not accurate in periods in which uncorrectable errors are present. Also referred to as Symbol Error Rate (SER) in this document. | 7280R2, 7280R3, 7500R2, 7500R3, 7800R3, 7060X4, 7368X4
Pre-FEC bit error rate | Ratio of corrected bits to total bits received in the last polling period. Not accurate in periods in which uncorrectable errors are present. | 7280R3, 7500R3, 7800R3, 7060X4, 7368X4
FEC uncorrected codewords | Count of codewords which could not be corrected in the last polling period. These represent potentially lost data. | All

 

On many platforms there are intermediate PHYs between the transceiver terminating the incoming link and the switch chip.   Often the internal links between the PHYs and the switch chip are protected by FEC.  When using FEC to determine the quality of a link between peers, the parameters collected on the PHY directly connected to the transceiver terminating the link should be used.  These are grouped under the ‘line’ parameters section of show int phy detail. For example, on ports 1-32 of a 7280CR3MK-32P4/D4 the interesting FEC parameters would be under the section labeled “CMS42550 line”.   

Internal link parameters are also shown in show int phy detail. These are in sections using the label “system” as in “CMS42550 system”. The system side parameters can be useful in confirming the internal system links are performing as expected. Errors in this section may require assistance from the Arista TAC to resolve.  

The drawing below shows where the statistics reported in show interfaces phy detail are collected. In general, it is the ‘line’ side statistics that are interesting for evaluating the quality of the link to the peer.

 

RS-FEC Statistics Collection Points

For both the system and the line side, after initial link up, there should be no uncorrectable errors in a well performing link.  The output below shows an example ‘show int phy detail’ display with the bulk of the sections unrelated to FEC omitted. This example shows a system with a PHY between the switch chip and the transceiver.   This results in 3 FEC decoder functions operating on this system. 

In the transmit (egress) direction there is a FEC decoder on the PHY associated with the system interface between the switch chip and the PHY identified as “CMS42550 system”.  This FEC receiver is decoding data transmitted by the switch chip. This data is retransmitted on the line side of the CMS42550 and will be received and decoded by the peer system.

In the receive (ingress) direction the line data from the transceiver is terminated on the CMS42550 with the data in the section identified as “CMS42550 line”.  

 

Arista#show int et29/1 phy detail
Current System Time: Mon Aug 31 19:54:51 2020
Ethernet29/1
                              Current State     Changes            Last Change
                              -------------     -------            -----------
  Interface state             up                     16            3:03:32 ago
...  
 BCM88690-TSCBH line
  Model                       BCM88690 (0x000000,0x25,0x1)
...
  Forward Error Correction    Reed-Solomon
  Reed-Solomon codeword size  544
  FEC alignment lock          ok                     31            3:03:35 ago
  FEC lane alignment marker lock          
    Lane 0                    ok                     31            3:03:35 ago
    Lane 1                    ok                     31            3:03:35 ago
    Lane 2                    ok                     31            3:03:35 ago
    Lane 3                    ok                     31            3:03:35 ago
  FEC corrected codewords     1                      30            0:51:42 ago
  FEC uncorrected codewords   0                       0                  never
  FEC corrected symbol rate   < 1.82E-11
  FEC lane corrected symbols  
...
    Lane 0                    1                       4            4:09:18 ago
    Lane 1                    0                       0                  never
    Lane 2                    1                      26            1:10:23 ago
    Lane 3                    1                       4            0:51:42 ago
  FEC lane mapping            
    FEC lane                  00 01 02 03
    PMA lane                  00 00 01 01
  Pre-FEC bit error rate      < 1.82E-12
...  
 CMS42550
  Model                       CMS42550 (B0)
  Firmware revision           01.90.91
 CMS42550 system
...
  Forward Error Correction    Reed-Solomon
  Reed-Solomon codeword size  544
  FEC alignment lock          ok                     15            3:03:35 ago
  FEC lane alignment marker lock          
    Lane 0                    ok                     15            3:03:35 ago
    Lane 1                    ok                     15            3:03:35 ago
    Lane 2                    ok                     15            3:03:35 ago
    Lane 3                    ok                     15            3:03:35 ago
  FEC corrected codewords     1                     126            0:00:26 ago
  FEC uncorrected codewords   3                      12            4:02:24 ago
  FEC corrected symbol rate   < 1.58E-11
  FEC lane corrected symbols  
    Lane 0                    5                       2            4:10:14 ago
    Lane 1                    1                      98            0:00:26 ago
    Lane 2                    0                       0                  never
    Lane 3                    1                      31            0:02:26 ago
  FEC lane mapping            
    FEC lane                  00 01 02 03
    PMA lane                  00 00 01 01
  Pre-FEC bit error rate      < 1.58E-12
... 
 CMS42550 line
...
  Forward Error Correction    Reed-Solomon
  Reed-Solomon codeword size  528
  FEC alignment lock          ok                     15            3:03:35 ago
  FEC lane alignment marker lock          
    Lane 0                    ok                     15            3:03:35 ago
    Lane 1                    ok                     15            3:03:35 ago
    Lane 2                    ok                     15            3:03:35 ago
    Lane 3                    ok                     15            3:03:35 ago
  FEC corrected codewords     1                       2            4:25:31 ago
  FEC uncorrected codewords   3                      12            4:02:24 ago
  FEC corrected symbol rate   < 1.63E-11
  FEC lane corrected symbols  
    Lane 0                    1                       2            4:25:31 ago
    Lane 1                    0                       0                  never
    Lane 2                    0                       0                  never
    Lane 3                    0                       0                  never
  FEC lane mapping            
    FEC lane                  00 01 02 03
    PMA lane                  00 01 02 03
  Pre-FEC bit error rate      < 1.63E-12
...

When uncorrectable errors are experienced, the calculation of pre-FEC BER and SER is compromised, because an uncorrectable codeword provides no information on how many bits or symbols were in error. This is signaled in the ‘show int phy detail’ output by an asterisk (‘*’) next to the value.

During periods where there are no corrections (all FEC codewords were received perfectly with no bits to correct) the BER is noted as BER < ( 1 / bits in the period). 
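When collecting these values with a script, the ‘<’ and ‘*’ markers need to be handled explicitly. The following is a minimal sketch that assumes the rate fields look exactly like those in the example output in this document (for instance ‘< 1.82E-12’ or ‘7.63E-04*’); `parse_rate` and `ber_upper_bound` are hypothetical helper names.

```python
def parse_rate(field):
    """Parse a rate field such as '9.45E-08', '< 1.82E-12' or '7.63E-04*'.

    Returns (value, is_upper_bound, is_compromised).
    """
    text = field.strip()
    is_compromised = text.endswith("*")    # uncorrectable errors in the period
    text = text.rstrip("*").strip()
    is_upper_bound = text.startswith("<")  # no corrections in the period
    text = text.lstrip("<").strip()
    return float(text), is_upper_bound, is_compromised

def ber_upper_bound(bits_in_period):
    """Bound reported when no bits were corrected: BER < 1 / bits."""
    return 1.0 / bits_in_period

print(parse_rate("< 1.82E-12"))
print(parse_rate("7.63E-04*"))
```

A value flagged as an upper bound indicates a clean period, while a value flagged as compromised should not be trended as if it were an accurate measurement.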

Pre-FEC BER vs SER

As described above, RS-FEC codewords are built from 10 bit symbols. When a codeword is corrected, the number of symbols which contain bits needing correction is counted. The ratio of the corrected symbols to the total number of symbols is the symbol error rate (SER). It is typical to have just 1 or 2 corrected bits in a symbol. In these cases the corrected symbol count is approximately equal to the corrected bit count. However, because there are 10 bits per symbol, the denominator in the SER calculation is smaller by a factor of 10. With 1 bit error per symbol error, SER = (10 * pre-FEC BER). This is often observed in the phy detail output.
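The factor of 10 can be seen with a small worked example. The counts below are made up for illustration, and the sketch assumes that the total symbol count covers every symbol of every received codeword.

```python
SYMBOL_BITS = 10  # bits per RS-FEC symbol

def symbol_error_rate(corrected_symbols, total_symbols):
    """Corrected symbols divided by total symbols received."""
    return corrected_symbols / total_symbols

def pre_fec_ber(corrected_bits, total_symbols):
    """Corrected bits divided by total bits received."""
    return corrected_bits / (total_symbols * SYMBOL_BITS)

# Hypothetical period: 1e9 codewords of 544 symbols, 5000 corrected symbols,
# each with exactly one corrected bit.
total_symbols = 10**9 * 544
ser = symbol_error_rate(5000, total_symbols)
ber = pre_fec_ber(5000, total_symbols)
print(f"SER={ser:.2E} BER={ber:.2E} ratio={ser / ber:.1f}")
```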

FEC Correction Histograms

As discussed above, RS-FEC can correct up to either 7 or 15 symbols per codeword.  A well performing link should have most corrections in the lower half of these limits and they should decay exponentially.  Often, links can operate with nearly all corrected codewords needing only 1 or 2 symbols corrected.  The number requiring 3 should be an order of magnitude lower, and 4 another order of magnitude lower.   

Some platforms support collecting and displaying a histogram of symbol corrections per codeword. This histogram can be used to determine if the link is reaching the limits of RS-FEC to correct the errors on the link. By monitoring this over time, links which are degrading can be identified and corrective action taken prior to a service affecting event. This histogram can be displayed with the command “show interfaces phy diag error-correction histogram”. An example is shown below.

 

Arista#show int et3/1 phy diag error-correction histogram 
 Ethernet3/1
  Symbol Errors Per Codeword  Codewords         Changes            Last Change
  --------------------------  ---------         -------            -----------
 CRT50216 system
  Bin0                        4075208478588       18078            0:00:01 ago
  Bin1                        7                       7     1 day, 4:42:37 ago
  Bin2                        0                       0                  never
  Bin3                        0                       0                  never
  Bin4                        0                       0                  never
  Bin5                        0                       0                  never
  Bin6                        0                       0                  never
  Bin7                        0                       0                  never
  Bin8                        0                       0                  never
  Bin9                        0                       0                  never
  Bin10                       0                       0                  never
  Bin11                       0                       0                  never
  Bin12                       0                       0                  never
  Bin13                       0                       0                  never
  Bin14                       0                       0                  never
  Bin15                       0                       0                  never
  Bin16+                      0                       0                  never
 CRT50216 line
  Bin0                        4077816452204       18084            0:00:01 ago
  Bin1                        259686               4260            0:00:01 ago
  Bin2                        1155                  195            0:00:24 ago
  Bin3                        19                     13            0:04:27 ago
  Bin4                        0                       0                  never
  Bin5                        0                       0                  never
  Bin6                        0                       0                  never
  Bin7                        0                       0                  never
  Bin8+                       0                       0                  never

 

In the above display there are two sections. The first is labeled “CRT50216 system” and this is data for the internal link between the PHY and the switch chip. This is data that is destined for the peer system.  The section “CRT50216 line” provides histogram data for the FEC engine receiving data from the peer system. This is the section that is interesting for monitoring the performance of an optical fiber link.  

 

The internal system side link consists of 50G PAM4 lanes and is protected by RS-544, which can correct up to 15 symbols per codeword. When a codeword is corrected, the bin corresponding to the number of symbols corrected is incremented. Bin0 counts codewords with no bit errors at all. Bin1 counts codewords with 1 symbol corrected, which could mean as few as 1 bit or as many as 10 bits. Bin2 counts codewords with 2 symbols corrected (at least 2, but as many as 20 bits corrected), and so on. If 16 or more symbols have errors, the codeword is uncorrectable. In these cases “Bin16+” is incremented.

 

The line side link above consists of 25G NRZ lanes protected by RS-528 FEC. RS-528 can correct up to 7 symbols per codeword. Again, “Bin0” counts received codewords with no errors, “Bin1” those with 1 symbol corrected, etc. If a codeword is uncorrectable, “Bin8+” is incremented.

 

This example shows that the system is operating with the overwhelming majority of traffic requiring no error correction (Bin0). The line side shows corrections up to Bin3 for a very small percentage (<1 out of 10 million) of the received codewords. Also indicated are a small number of correctable single symbol errors on the interface between the CRT50216 PHY and the switch chip. In other words, this is a healthy path.
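The “<1 out of 10 million” figure can be reproduced from the histogram text itself. The parser below is a sketch that assumes the output format matches the example above (section headers ending in “line” or “system”, followed by BinN rows); `parse_histogram` and `errored_fraction` are hypothetical names, and the sample is an excerpt of the line side values shown earlier.

```python
import re

# Matches rows such as "  Bin3   19   13   0:04:27 ago" and "  Bin8+  0 ...".
BIN_RE = re.compile(r"^\s*Bin(\d+)\+?\s+(\d+)\s")

def parse_histogram(text):
    """Return {section: {bin_number: codeword_count}} from histogram output."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = BIN_RE.match(line)
        if match and current is not None:
            sections[current][int(match.group(1))] = int(match.group(2))
        elif line.strip().endswith((" line", " system")):
            current = line.strip()
            sections[current] = {}
    return sections

def errored_fraction(bins):
    """Fraction of codewords with at least one symbol in error
    (corrected or uncorrectable)."""
    total = sum(bins.values())
    return (total - bins.get(0, 0)) / total if total else 0.0

sample = """\
 CRT50216 line
  Bin0                        4077816452204       18084            0:00:01 ago
  Bin1                        259686               4260            0:00:01 ago
  Bin2                        1155                  195            0:00:24 ago
  Bin3                        19                     13            0:04:27 ago
  Bin8+                       0                       0                  never
"""
bins = parse_histogram(sample)["CRT50216 line"]
print(f"{errored_fraction(bins):.2E}")
```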

 

The output below shows an example of a link which has experienced uncorrectable errors; Bin8+ has accumulated counts of uncorrectable codewords.

 

Arista#show int Et81/1 phy diag error-correction histogram | nz
 Ethernet81/1
  Symbol Errors Per Codeword  Codewords         Changes            Last Change
  --------------------------  ---------         -------            -----------
 CRT50216 system
  Bin0                        53066018469           236            0:00:01 ago
 CRT50216 line
  Bin0                        52515664515           235            0:00:01 ago
  Bin1                        157748                  3            0:16:00 ago
  Bin2                        41705                   3            0:16:00 ago
  Bin3                        7279                    3            0:16:00 ago
  Bin4                        1052                    3            0:16:00 ago
  Bin5                        79                      3            0:16:00 ago
  Bin6                        11                      3            0:16:00 ago
  Bin7                        1178                    3            0:16:00 ago
  Bin8+                       35597                   2            0:16:12 ago

Histogram for a Link with Uncorrectable Errors

 

The histogram data can be cleared with the “clear phy counters” command.

The table below shows platform support for displaying FEC histogram information.   Note that all platform support is at the time of this writing.  Additional platforms may be added in the future.  

 

Platform | Ports | Speeds
7500R3-36CQ-LC, 7500R3K-36CQ-LC | 1-12, 25-36 | 100G-4, 50G-2
DCS-7280CR3-32P4, DCS-7280CR3-32D4, DCS-7280CR3K-32D4 | 1-32 | 100G-4, 50G-2
DCS-7280CR3-96, DCS-7280CR3K-96-F | 1-96 | 100G-4, 50G-2
DCS-7280CR3MK-32P4, DCS-7280CR3MK-32D4 | 1-32 | 100G-4, 50G-2, 40G, 25G, 10G
7800R3-36DM-LC | 1-36 | 400G-8, 200G-4, 100G-2, 100G-4, 50G-1, 50G-2, 40G, 25G, 10G

 

Firecode FEC

Firecode FEC is much less widely used than RS-FEC.  The main advantage of FC-FEC over RS-FEC is that it has somewhat lower latency; RS-FEC at 25G adds 250ns of latency, while FC-FEC adds about 80ns.  Transmission over single mode fiber adds about 5ns of latency per meter of fiber. So a fiber run of over 50 meters will add more latency than RS-FEC.  The primary use case for FC-FEC is over short twinax cables where the propagation latency is less significant.  It can also be used at 25G and 50G with fiber optic connections when the preferred RS-FEC is not available.  For 25G and 50G-2 links, if latency is an important factor to optimize, it is recommended to engineer the links to allow running with FEC disabled.  For instance, use twinax cables designated as CA-N. 
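The latency comparison above is simple arithmetic, sketched below using the approximate figures quoted in the paragraph. It considers only FEC latency and fiber propagation delay and ignores every other contribution to link latency; the helper names are hypothetical.

```python
FIBER_NS_PER_M = 5.0   # approximate propagation delay per meter quoted above
RS_FEC_NS = 250.0      # RS-FEC latency at 25G quoted above
FC_FEC_NS = 80.0       # FC-FEC latency quoted above

def link_latency_ns(fiber_m, fec_ns):
    """Propagation delay plus FEC latency for one direction of a link."""
    return fiber_m * FIBER_NS_PER_M + fec_ns

def breakeven_length_m(fec_ns):
    """Fiber length whose propagation delay equals the FEC latency."""
    return fec_ns / FIBER_NS_PER_M

print(breakeven_length_m(RS_FEC_NS))   # meters of fiber equal to RS-FEC latency
print(breakeven_length_m(FC_FEC_NS))   # meters of fiber equal to FC-FEC latency
# Saving from choosing FC-FEC over RS-FEC on a 100 m run:
print(link_latency_ns(100, RS_FEC_NS) - link_latency_ns(100, FC_FEC_NS))
```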

 

When transmitting with FC-FEC enabled, the PHY will again encode the data into blocks, adding parity bits. In contrast to RS-FEC, FC-FEC may have multiple independent lanes of FEC running and the number of FEC lanes may not match the number of PMD lanes.  50G-2 links running with FC-FEC have 2 PMD lanes. However, FC-FEC encodes 4 FEC lanes and then muxes data from 2 FEC lanes to each PMD lane. The result is that there are 4 sets of counters for a 50G-2 link.  This can be seen in the output below.  Note also that FC-FEC implementations do not provide counters which may be used to generate a pre-FEC BER or SER.  The only counters available are the corrected and uncorrected block counters. 

 

Arista#show int et6/1 phy detail
Current System Time: Wed Sep  2 18:43:22 2020
Ethernet6/1
                              Current State     Changes            Last Change
                              -------------     -------            -----------
  Interface state             up                      4            0:01:27 ago
...
 BCM56965-TSCF line
  Model                       BCM56965-TSCF (A) (0x00c086,0x37,0x0)
  ...
  Forward Error Correction    Fire-Code
  FEC corrected blocks
    Lane 0                    0                       0                  never
    Lane 1                    0                       0                  never
    Lane 2                    0                       0                  never
    Lane 3                    0                       0                  never
  FEC uncorrected blocks
    Lane 0                    0                       0                  never
    Lane 1                    0                       0                  never
    Lane 2                    0                       0                  never
    Lane 3                    0                       0                  never
...
  

 

400G Histogram Examples

The first set of output below shows the line side BER as measured on the CMS50216 PHY and the corresponding histogram output. This output was collected from a live link running 400G over a 400GBASE-DR4 transceiver with a short fiber run. This link is quite healthy, exhibiting good margin in both raw BER and corrections per codeword. On the CMS50216 line side, this 400G link receives most codewords without needing any symbol corrections (Bin0). The count of codewords with 1 symbol corrected (Bin1) is over 3 orders of magnitude lower, meaning fewer than 1 in 1000 codewords need any correction at all.

 

Arista#show int et9/1/1 phy detail
   ...
CMS50216 line
  FEC corrected symbol rate   9.45E-07
  Pre-FEC bit error rate      9.45E-08
Arista#show int et9/1/1 phy diag error-correction histogram
 Ethernet9/1/1
  Symbol Errors Per Codeword  Codewords         Changes            Last Change
  --------------------------  ---------         -------            -----------
 CMS50216 system
  Bin0                        904417222306         4818            0:00:00 ago
  Bin1                        1                       1            0:26:14 ago
  ...
 CMS50216 line
  Bin0                        903668687198         4816            0:00:00 ago
  Bin1                        142901852            4816            0:00:00 ago
  Bin2                        300529               4816            0:00:00 ago
  Bin3                        3313                 1352            0:00:00 ago
  Bin4                        109                   100            0:00:29 ago
  Bin5                        7                       7            0:08:53 ago
  Bin6                        1                       1            0:40:19 ago
  Bin7                        0                       0                  never
  Bin8                        0                       0                  never
  Bin9                        0                       0                  never
  Bin10                       0                       0                  never
  Bin11                       0                       0                  never
  Bin12                       0                       0                  never
  Bin13                       0                       0                  never
  Bin14                       0                       0                  never
  Bin15                       0                       0                  never
  Bin16+                      0                       0                  never

 

This next example shows a link which is experiencing uncorrectable errors which were injected by a test device. This is reflected both in the BER/SER output and the histogram. Note that in the face of uncorrectable errors, the rates are both marked with ‘*’ and Bin16+ of the histogram is accumulating. 

 

Arista#show int et9/1/1 phy detail
...
 CMS50216 line
  FEC corrected symbol rate   7.63E-04*
  Pre-FEC bit error rate      7.68E-05*
Arista#show int et9/1/1 phy diag error-correction histogram
 Ethernet9/1/1
  Symbol Errors Per Codeword  Codewords         Changes            Last Change
  --------------------------  ---------         -------            -----------
 CMS50216 system
  Bin0                        2237247510              3            0:00:01 ago
  ...
 CMS50216 line
  Bin0                        1503627389              3            0:00:01 ago
  Bin1                        595041361               3            0:00:01 ago
  Bin2                        92371309                3            0:00:01 ago
  Bin3                        21267493                3            0:00:01 ago
  Bin4                        8399044                 3            0:00:01 ago
  Bin5                        4493216                 3            0:00:01 ago
  Bin6                        74908                   3            0:00:01 ago
  Bin7                        62796                   3            0:00:01 ago
  Bin8                        21772                   3            0:00:01 ago
  Bin9                        39833                   3            0:00:01 ago
  Bin10                       10754                   1            0:00:21 ago
  Bin11                       6335                    1            0:00:21 ago
  Bin12                       56098                   1            0:00:21 ago
  Bin13                       7402                    1            0:00:21 ago
  Bin14                       44176                   1            0:00:21 ago
  Bin15                       26242                   1            0:00:21 ago
  Bin16+                      14994                   1            0:00:21 ago

 

Summary

Forward error correction is an important component of high speed signaling; it allows error free operation of media that is not inherently error free.  This allows for lower cost optics and in some cases, use of lower cost fiber.  For copper cables, it extends the reach or allows the same reach with a smaller gauge cable. Such cables are less expensive and easier to manage. As these media are designed for use with FEC enabled, correctable errors are expected and are no reason for alarm.  

 

Users should monitor links for FEC uncorrected codewords, as these indicate a problem if observed after initial link up. Links with an increasing pre-FEC bit error rate, or links which begin to accumulate counts in higher-numbered FEC histogram bins, should also be investigated. These trends may predict links which will experience uncorrected FEC codewords in the future.
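As one possible way to automate these checks, the sketch below compares two histogram snapshots for a single PHY section and reports the two conditions named above: newly uncorrected codewords and corrections reaching a higher bin than before. The snapshots are assumed to be plain dictionaries of bin number to codeword count, the function names are hypothetical, and the example counts are invented.

```python
def highest_used_bin(bins, uncorrectable_bin):
    """Highest correctable bin (1..uncorrectable_bin-1) with a non-zero count."""
    used = [b for b, count in bins.items()
            if count > 0 and 0 < b < uncorrectable_bin]
    return max(used, default=0)

def assess_link(previous, current, uncorrectable_bin):
    """Return a list of findings; an empty list means nothing new to report."""
    findings = []
    new_uncorrected = (current.get(uncorrectable_bin, 0)
                       - previous.get(uncorrectable_bin, 0))
    if new_uncorrected > 0:
        findings.append(f"{new_uncorrected} new uncorrected codewords")
    before = highest_used_bin(previous, uncorrectable_bin)
    after = highest_used_bin(current, uncorrectable_bin)
    if after > before:
        findings.append(f"corrections reached Bin{after} (previously Bin{before})")
    return findings

# RS-528 link: Bin8+ is the uncorrectable bin.
earlier = {0: 1_000_000, 1: 40, 2: 1, 8: 0}
later = {0: 2_000_000, 1: 95, 2: 3, 3: 1, 8: 0}
print(assess_link(earlier, later, uncorrectable_bin=8))
```

Counts that go down between snapshots (for example after the counters are cleared) produce no findings in this sketch, so a real collector would want to detect and handle that case separately.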

References

  1. https://eos.arista.com/eos-4-22-0f/clear-phy-counters
  2. https://eos.arista.com/eos-4-23-2f/fec-traffic-analyzer/