MLAG ISSU

Overview

MLAG ISSU (In-Service Software Upgrade) upgrades EOS software on one MLAG peer at a time with minimal traffic disruption on active MLAG interfaces and without changing the network topology.

Note: Traffic impact may still be seen on orphan links, on active-partial MLAG links, and for packets in flight.


 

MLAG considerations before upgrade

 

I. Check for configuration inconsistencies

The following features should be configured consistently on both MLAG peers:

  • VLANs
  • Switchport configuration on port channel interfaces that are configured with an MLAG ID
  • STP configuration (global)

In EOS 4.15.2F and later, the MLAG configuration check feature can be used to detect these inconsistencies:

https://eos.arista.com/eos-4-15-2f/mlag-config-check/
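
On releases without the configuration check feature, the comparison above can be approximated offline. The following is a minimal sketch, assuming the per-peer settings have already been collected into dictionaries; the dictionary keys, the sample values, and the function name `find_mismatches` are hypothetical and do not correspond to any EOS command or API.

```python
def find_mismatches(peer1: dict, peer2: dict) -> dict:
    """Return {feature: (peer1_value, peer2_value)} for every feature that differs."""
    mismatches = {}
    for feature in sorted(set(peer1) | set(peer2)):
        left, right = peer1.get(feature), peer2.get(feature)
        if left != right:
            mismatches[feature] = (left, right)
    return mismatches


# Hypothetical summaries of the three items listed above (illustrative values only).
peer1 = {
    "vlans": {10, 20, 30},
    "mlag_switchports": {10: "switchport mode trunk"},
    "stp_mode": "mstp",
}
peer2 = {
    "vlans": {10, 20},
    "mlag_switchports": {10: "switchport mode trunk"},
    "stp_mode": "mstp",
}

result = find_mismatches(peer1, peer2)
for feature, (left, right) in result.items():
    print(f"{feature}: peer1={left} peer2={right}")
```

An empty result means the compared items match; it says nothing about settings that were not collected.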

 

II. Resolve ISSU warnings

Resolve the following warnings before performing the upgrade:

1. Active-partial MLAG warning:

Example:

switch#show mlag issu warnings

The following MLAGs are not in Active mode. Traffic to or from these ports will be lost during the upgrade process.
                                                              local/remote
   mlag         desc             state          local       remote        status
----------   ----------   -----------------   ---------   ----------   ------------
    10        MLAG-10      active-partial        Po10        Po10        up/down


Because not all links in the MLAG port-channel are active, bring up the remote port-channel or link on the peer before reloading to avoid packet loss. If the MLAG is not actively used, or the link is expected to be down, this warning can be ignored.
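
When many MLAGs are listed, the table can be scanned programmatically to see which side needs attention. This is a rough sketch that assumes the column layout of the example above and a description without spaces; the regular expression and the function name `parse_inactive_mlags` are illustrative, not part of EOS.

```python
import re

# One data row: mlag id, description, state, local port, remote port, local/remote status.
ROW = re.compile(
    r"^\s*(?P<mlag>\d+)\s+(?P<desc>\S+)\s+(?P<state>\S+)\s+"
    r"(?P<local>\S+)\s+(?P<remote>\S+)\s+(?P<status>\S+/\S+)\s*$"
)


def parse_inactive_mlags(output: str) -> list:
    """Return one dict per MLAG row found in the warning table."""
    rows = []
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            row = match.groupdict()
            local_status, remote_status = row["status"].split("/", 1)
            row["local_status"] = local_status
            row["remote_status"] = remote_status
            rows.append(row)
    return rows


sample = """\
   mlag       desc         state              local        remote        status
----------   ----------   -----------------  -----------  ------------  ------------
 10         MLAG-10        active-partial     Po10         Po10          up/down
"""

for row in parse_inactive_mlags(sample):
    side = "peer" if row["remote_status"] == "down" else "local switch"
    print(f"MLAG {row['mlag']} is {row['state']}: check {row['remote']} on the {side}")
```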

 

2. STP agent is not restartable:

Example:

switch#show mlag issu warnings

Stp is not restartable. Topology changes will occur during the upgrade process.

The above warning can appear if the peer has just reloaded. Wait for the Stp agent to become restartable: this typically takes about 30 seconds, or approximately 120 seconds for a newly started Stp agent.

Other common reasons that the STP agent is not restartable are:

  1. An STP event: This is normally seen when the STP topology on the switches is not stable. Check for ports that repeatedly move to the discarding state and, if any are present, resolve the issue with those ports. Then wait at least 2 minutes to confirm that the topology remains stable.
  2. STP configuration mismatch: Check that the STP configuration matches on each peer, e.g. bridge priority and STP mode.

To check if the agent is restartable:

switch# show spanning-tree bridge detail

Stp agent is restartable
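
The wait described above can be scripted as a simple poll. The sketch below assumes a caller-supplied `fetch` function that returns the text of "show spanning-tree bridge detail" (how that text is obtained is out of scope here), and the 180-second timeout and 10-second interval are arbitrary illustrative values. Only the positive message shown above is matched; the output of an agent that is not yet restartable is not shown in this article, so the demo uses a placeholder string for it.

```python
import time

RESTARTABLE_MARKER = "Stp agent is restartable"


def wait_until_restartable(fetch, timeout_s=180.0, interval_s=10.0,
                           sleep=time.sleep, clock=time.monotonic):
    """Poll fetch() until the restartable message appears or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if RESTARTABLE_MARKER in fetch():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Demo with canned output instead of a real switch; sleeping is skipped.
canned = iter(["<placeholder: agent not ready>", "Stp agent is restartable"])
ready = wait_until_restartable(lambda: next(canned), sleep=lambda seconds: None)
print("restartable" if ready else "timed out")
```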

 

3. Configured reload delay is too low:

Example:

switch#show mlag issu warnings
The configured reload delay of 100 seconds is below the default value of 300 seconds. A longer reload delay allows more time to rollback an unsuccessful upgrade due to incompatibility.

It is recommended to configure a reload delay value greater than or equal to the default. The default delay period varies by switch type.

 

4. Peer has errdisabled interfaces:

Example:

switch#show mlag issu warnings

The other MLAG peer has errdisabled interfaces. Traffic loss will occur during the upgrade process.

The above warning is usually seen when the peer has been reloaded and the MLAG reload delay timer is still active on that peer at the time the “show mlag issu warnings” command is issued on the local switch. To avoid packet loss, wait for the reload delay timer (default of 300 seconds) to expire on the peer.

 

5. Image compatibility check for new and current EOS version:

Example:

switch#show mlag issu warnings

If you are performing an upgrade, and the Release Notes for the new version of EOS indicate that MLAG is not backwards-compatible with the currently installed version (4.12.4), the upgrade will result in packet loss.

The above is an informational warning. We strongly recommend consulting the release notes of the release that the device will be upgraded to in order to validate compatibility.

Starting with 4.15.2F, the new and existing images can also be compared using MLAG ISSU compatibility detection:

https://eos.arista.com/eos-4-15-2f/automatic-mlag-issu/

If the images are not compatible, refer to the release notes of the new image.
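
The five warnings in this section can be told apart by the phrases in their example output. The sketch below is one possible way to triage captured output; the category names and the split between informational warnings and those needing attention are assumptions made for illustration, and the phrases are taken from the examples in this article rather than from an exhaustive list.

```python
# (category, phrase that identifies it in the example output above)
WARNING_MARKERS = [
    ("active_partial_mlag", "not in Active mode"),
    ("stp_not_restartable", "Stp is not restartable"),
    ("reload_delay_too_low", "configured reload delay of"),
    ("peer_errdisabled", "peer has errdisabled interfaces"),
    ("compatibility_notice", "not backwards-compatible"),
]
# Assumption: only the compatibility message is purely informational.
INFORMATIONAL = {"compatibility_notice"}


def classify_warnings(output: str) -> list:
    """Return the categories whose phrase appears in the output, in table order."""
    flat = " ".join(output.split())  # undo line wrapping before matching
    return [name for name, marker in WARNING_MARKERS if marker in flat]


def needs_attention(output: str) -> list:
    """Return the categories that should be resolved or reviewed before the reload."""
    return [name for name in classify_warnings(output) if name not in INFORMATIONAL]


captured = (
    "Stp is not restartable. Topology changes will occur during the upgrade process.\n"
    "The other MLAG peer has errdisabled interfaces. Traffic loss will occur during "
    "the upgrade process.\n"
)
print(needs_attention(captured))
```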

 

III. Choose the correct upgrade code path

  • Check the MLAG ISSU section in the release notes of the image that the switch will be upgraded to
  • Confirm whether the image is directly compatible (check whether the existing image is listed under the “Compatible EOS version” column)
  • If the existing and new versions are compatible, proceed with a direct MLAG ISSU upgrade
  • If the existing image is not listed as compatible, choose an interim image that is compatible with both the existing and the new image and perform the MLAG ISSU upgrade/downgrade via: Existing → Interim → New

Example 1: Upgrading from EOS version 4.14.9M to 4.15.5M

Check the release notes for 4.15.5M under MLAG ISSU Compatibility Matrix

Confirm if 4.14.9M is directly compatible with 4.15.5M

Example table:

[Screenshot: MLAG ISSU compatibility matrix from the 4.15.5M release notes]

Since 4.14.9M is not directly compatible with 4.15.5M, find an interim release that is compatible with both versions.

[Screenshot: compatibility matrix entries for the interim release]

In the above example the upgrade procedure would be:

4.14.9M → 4.14.10M → 4.15.5M

 

Example 2: Upgrading from EOS version 4.13.12M to 4.15.5M

Check the release notes for 4.15.5M under MLAG ISSU Compatibility Matrix

Confirm if 4.13.12M is directly compatible with 4.15.5M

Example table:

[Screenshot: MLAG ISSU compatibility matrix from the 4.15.5M release notes]

 

Since 4.13.12M is directly compatible with 4.15.5M, the upgrade path will be:

 4.13.12M → 4.15.5M
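
The path selection in section III can be sketched as a small lookup. The matrix below is not copied from any release notes: it is a hand-built illustration that encodes only the conclusions of the two examples above, and the function name `upgrade_path` is hypothetical. Always take the real entries from the MLAG ISSU Compatibility Matrix of the relevant release notes.

```python
def upgrade_path(current, target, compatible):
    """Return [current, target], [current, interim, target], or None.

    compatible maps an image version to the set of versions that its
    release notes list as MLAG ISSU compatible.
    """
    direct = compatible.get(target, set())
    if current in direct:
        return [current, target]
    # Look for an interim image that the target lists and that lists the current image.
    # sorted() is plain string ordering, used only to make the result deterministic.
    for interim in sorted(direct):
        if current in compatible.get(interim, set()):
            return [current, interim, target]
    return None


# Illustrative matrix derived from Example 1 and Example 2 only.
matrix = {
    "4.15.5M": {"4.13.12M", "4.14.10M"},
    "4.14.10M": {"4.14.9M"},
}

print(" → ".join(upgrade_path("4.14.9M", "4.15.5M", matrix)))
print(" → ".join(upgrade_path("4.13.12M", "4.15.5M", matrix)))
```

The sketch searches for at most one interim image, matching the Existing → Interim → New procedure described above; it returns None when no such path exists in the supplied matrix.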

 

Upgrade Procedure

The following procedure performs an MLAG ISSU upgrade:

Step 1: Verify MLAG status

Peer1#show mlag
MLAG Configuration:
domain-id           :                   1
local-interface     :            Vlan4094
peer-address        :             1.1.1.1
peer-link           :    Port-Channel1000
peer-config         :          consistent
MLAG Status:
state               :              Active
negotiation status  :           Connected
peer-link status    :                  Up
local-int status    :                  Up
system-id           :   02:1c:73:90:32:1b
MLAG Ports:
Disabled            :                   0
Configured          :                   0
Inactive            :                   0
Active-partial      :                   0
Active-full         :                   0
Peer1#show spanning-tree bridge detail 
Stp agent is restartable
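
The Step 1 check can be automated by turning the "show mlag" output into key/value pairs. This is a minimal sketch that assumes the "key : value" layout shown above; the set of fields treated as required in `EXPECTED` is an assumption based on the healthy sample output, not an official readiness definition.

```python
SAMPLE = """\
MLAG Configuration:
domain-id           :                   1
local-interface     :            Vlan4094
peer-address        :             1.1.1.1
peer-link           :    Port-Channel1000
peer-config         :          consistent
MLAG Status:
state               :              Active
negotiation status  :           Connected
peer-link status    :                  Up
local-int status    :                  Up
system-id           :   02:1c:73:90:32:1b
"""

# Assumed healthy values, taken from the sample output above.
EXPECTED = {
    "peer-config": "consistent",
    "state": "Active",
    "negotiation status": "Connected",
    "peer-link status": "Up",
    "local-int status": "Up",
}


def parse_show_mlag(output: str) -> dict:
    """Split each 'key : value' line at the first colon; skip headers and prompts."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def readiness_problems(fields: dict) -> list:
    """List every expected field whose value differs from the healthy sample."""
    return [
        f"{key}: expected {want!r}, got {fields.get(key)!r}"
        for key, want in EXPECTED.items()
        if fields.get(key) != want
    ]


problems = readiness_problems(parse_show_mlag(SAMPLE))
print("ready for upgrade" if not problems else "\n".join(problems))
```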

 

Step 2: Load the new image on one of the peers and set it as the boot image

Peer1#copy <source_file_path> flash:EOS-<Version>.swi
Peer1(config)#boot system flash:EOS-<Version>.swi

 

Step 3: Verify version compatibility between the new and existing image

Starting with 4.15.2F:

Peer1#show mlag issu compatibility flash:EOS-4.15.6M.swi
/mnt/flash/EOS-4.15.6M.swi (4.15.6M) is MLAG ISSU compatible with the
current image (4.15.4F).

In addition, check the release notes for further validation.
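
If the compatibility check is scripted, the positive message can be recognised as follows. This sketch matches only the wording shown in the output above; the wording used for incompatible images is not shown in this article, so a result of None means only that the positive message was not found. The function name `parse_compatibility` is illustrative.

```python
import re

COMPATIBLE = re.compile(
    r"\((?P<new>[^)]+)\) is MLAG ISSU compatible with the current image "
    r"\((?P<current>[^)]+)\)"
)


def parse_compatibility(output: str):
    """Return (new_version, current_version) if the positive message is present."""
    flat = " ".join(output.split())  # the message wraps across two lines
    match = COMPATIBLE.search(flat)
    return (match["new"], match["current"]) if match else None


sample = (
    "/mnt/flash/EOS-4.15.6M.swi (4.15.6M) is MLAG ISSU compatible with the\n"
    "current image (4.15.4F).\n"
)
print(parse_compatibility(sample))
```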

 

Step 4: Check for configuration inconsistencies (see section I above)

 

Step 5: Resolve ISSU warnings

Peer1#show mlag issu warnings
If you are performing an upgrade, and the Release Notes for the new
version of EOS indicate that MLAG is not backwards-compatible with the
currently installed version (4.15.4F), the upgrade will result in
packet loss.

[All other warnings have been resolved]

 

Step 6: Reboot the peer

Peer1#reload
Proceed with reload? [confirm]

 

Step 7: Wait for the peers to renegotiate to the active state and for the reload delay timer to expire on the rebooted peer. Avoid configuration changes on both peers after this step.

Peer1#show mlag detail

MLAG Configuration:
domain-id           :                   1
local-interface     :            Vlan4094
peer-address        :             1.1.1.1
peer-link           :    Port-Channel1000
peer-config         :          consistent
MLAG Status:
state               :     Inactive/Reload (0:03:00 left) → Reload delay timer is still in effect.
negotiation status  :          Connecting
peer-link status    :      Lowerlayerdown
local-int status    :                  Up
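
To know how long to wait in Step 7, the remaining reload delay can be read from the state field. The sketch assumes the "(H:MM:SS left)" format seen in the output above; the function name `reload_delay_remaining` is illustrative.

```python
import re

REMAINING = re.compile(r"\((\d+):(\d{2}):(\d{2}) left\)")


def reload_delay_remaining(state: str):
    """Return the remaining reload delay in seconds, or None if no timer is shown."""
    match = REMAINING.search(state)
    if match is None:
        return None
    hours, minutes, seconds = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


print(reload_delay_remaining("Inactive/Reload (0:03:00 left)"))
print(reload_delay_remaining("Active"))
```

A value of None combined with a state of Active indicates that the timer has expired and the peers have renegotiated.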

 

Step 8: Repeat Steps 1-6 for the other peer

 

Step 9: Confirm overall MLAG status

Peer1(config)#show mlag
MLAG Configuration:
domain-id           :                   1
local-interface     :            Vlan4094
peer-address        :             1.1.1.1
peer-link           :    Port-Channel1000
peer-config         :          consistent
MLAG Status:
state               :              Active 

Peer2(config)#show mlag 
MLAG Configuration: 
domain-id           :                   1 
local-interface     :            Vlan4094 
peer-address        :             1.1.1.2 
peer-link           :    Port-Channel1000 
peer-config         :          consistent 
MLAG Status:
state               :              Active