Posted on March 22, 2021 3:53 am
 |  Asked by Gabor Ivanszky

Hi,

We realized that we have unintended ECMP on a DCS-7060CX-32S because the BGP paths differ only in their cluster list length.

We added “bgp bestpath tie-break cluster-list-length” and did a soft BGP reset, but the situation is the same. Do we need to do anything else to activate the “bgp bestpath tie-break cluster-list-length” setting?

* >Ec 2.58.168.0/22 217.113.61.226 0 160 0 42864 ?
* ec 2.58.168.0/22 217.113.61.56 0 160 0 42864 ? Or-ID: 217.113.61.224 C-LST: 217.113.61.188
* >Ec 5.56.32.0/24 217.113.61.226 0 160 0 41075 i
* ec 5.56.32.0/24 217.113.61.56 0 160 0 41075 i Or-ID: 217.113.61.224 C-LST: 217.113.61.188
* >Ec 5.56.39.0/24 217.113.61.226 0 160 0 41075 i
* ec 5.56.39.0/24 217.113.61.56 0 160 0 41075 i Or-ID: 217.113.61.224 C-LST: 217.113.61.188

Posted by Naveen Chandra
Answered on March 25, 2021 3:08 am

Hi Gabor,

Thanks for posting in this forum. From the output you have shared, it looks like ECMP is enabled (the “maximum-paths” value is more than 1) under router bgp on your device. Could you please check and confirm this?

If so, you may want to reset it to the default with "default maximum-paths".

Also, could you please let me know the Arista platform and EOS version you are using? The document below explains the BGP best-path selection algorithm in detail.

https://eos.arista.com/eos-4-25-0f/bgp-best-path/

Posted by Keerthi Bharathi
Answered on August 20, 2021 2:27 am

Hello Gabor, 

I will try to explain what Naveen mentioned, using a lab to show how maximum-paths and the bgp bestpath tie-break work together.

1. Consider the following topology. iBGP peering is configured between these devices. The network 11.11.11.11/32 is advertised into BGP on Sw1. Sw3 is a route reflector with Sw1 and Sw2 acting as clients.

Config: 

Sw1:
router bgp 1
neighbor 192.168.12.2 remote-as 1
neighbor 192.168.12.2 next-hop-self
neighbor 192.168.13.3 remote-as 1
neighbor 192.168.13.3 next-hop-self
network 11.11.11.11/32

Sw2:
router bgp 1
neighbor 192.168.12.1 remote-as 1
neighbor 192.168.12.1 next-hop-self
neighbor 192.168.23.3 remote-as 1
neighbor 192.168.23.3 next-hop-self

Sw3:
router bgp 1
neighbor 192.168.13.1 remote-as 1
neighbor 192.168.13.1 next-hop-self
neighbor 192.168.13.1 route-reflector-client
neighbor 192.168.23.2 remote-as 1
neighbor 192.168.23.2 next-hop-self
neighbor 192.168.23.2 route-reflector-client

2. When we check show ip bgp on Sw2, we see that it has learnt the network 11.11.11.11/32 from both Sw1 and Sw3. As per point 14.1.2 in https://eos.arista.com/eos-4-25-0f/bgp-best-path/, the preferred path is the one learnt from Sw1 since it has the shortest cluster list length.

Sw2#show ip bgp 11.11.11.11/32
BGP routing table information for VRF default
Router identifier 192.168.23.2, local AS number 1
BGP routing table entry for 11.11.11.11/32
Paths: 2 available
Local
192.168.12.1 from 192.168.12.1 (11.11.11.11) <<<< Sw1
Origin IGP, metric 0, localpref 100, IGP metric 1, weight 0, received 00:01:21 ago, valid, internal, best
Rx SAFI: Unicast
Local
192.168.23.3 from 192.168.23.3 (192.168.23.3) <<<< Sw3
Origin IGP, metric 0, localpref 100, IGP metric 1, weight 0, received 00:01:21 ago, valid, internal
Originator: 200.0.0.200, Cluster list: 192.168.23.3
Rx SAFI: Unicast

3. Let's configure the maximum-paths command on Sw2. 

router bgp 1
maximum-paths 2
neighbor 192.168.12.1 remote-as 1
neighbor 192.168.12.1 next-hop-self
neighbor 192.168.23.3 remote-as 1
neighbor 192.168.23.3 next-hop-self

4. The show ip bgp output for the network 11.11.11.11/32 now shows that the routes learnt from both Sw1 and Sw3 are valid ECMP contributors. We see both as next hops in “show ip route” for that network.

Sw2#show ip bgp 11.11.11.11/32
BGP routing table information for VRF default
Router identifier 192.168.23.2, local AS number 1
BGP routing table entry for 11.11.11.11/32
Paths: 2 available
Local
192.168.12.1 from 192.168.12.1 (11.11.11.11) <<< Sw1
Origin IGP, metric 0, localpref 100, IGP metric 1, weight 0, received 00:06:22 ago, valid, internal, ECMP head, ECMP, best, ECMP contributor
Rx SAFI: Unicast
Local
192.168.23.3 from 192.168.23.3 (192.168.23.3) <<< Sw3
Origin IGP, metric 0, localpref 100, IGP metric 1, weight 0, received 00:06:23 ago, valid, internal, ECMP, ECMP contributor
Originator: 11.11.11.11, Cluster list: 192.168.23.3
Rx SAFI: Unicast

Sw2#show ip route 11.11.11.11/32

VRF: default
Codes: C - connected, S - static, K - kernel,
O - OSPF, IA - OSPF inter area, E1 - OSPF external type 1,
E2 - OSPF external type 2, N1 - OSPF NSSA external type 1,
N2 - OSPF NSSA external type2, B - BGP, B I - iBGP, B E - eBGP,
R - RIP, I L1 - IS-IS level 1, I L2 - IS-IS level 2,
O3 - OSPFv3, A B - BGP Aggregate, A O - OSPF Summary,
NG - Nexthop Group Static Route, V - VXLAN Control Service,
DH - DHCP client installed default route, M - Martian,
DP - Dynamic Policy Route, L - VRF Leaked,
G - gRIBI, RC - Route Cache Route

B I 11.11.11.11/32 [200/0] via 192.168.23.3, Ethernet1/1
via 192.168.12.1, Ethernet34/1

5. Since the paths are equally preferred, they form an ECMP group. The ECMP head is the route learnt from Sw1, meaning that when Sw2 has to advertise this network (11.11.11.11/32) to its BGP neighbors, it advertises this path. With no tie-breaker set, the route that was received first becomes the ECMP head.

For example, if I shut down the neighbor on Sw1 and then do a no shut, the ECMP head becomes the route learnt from Sw3, because it is now the older of the two.

Sw1(config-router-bgp)#neighbor 192.168.12.2 shut
Sw1(config-router-bgp)#no neighbor 192.168.12.2 shut

Sw2#show ip bgp 11.11.11.11/32
BGP routing table information for VRF default
Router identifier 192.168.23.2, local AS number 1
BGP routing table entry for 11.11.11.11/32
Paths: 2 available
Local
192.168.23.3 from 192.168.23.3 (192.168.23.3) <<< Sw3
Origin IGP, metric 0, localpref 100, IGP metric 1, weight 0, received 00:14:32 ago, valid, internal, ECMP head, ECMP, best, ECMP contributor
Originator: 11.11.11.11, Cluster list: 192.168.23.3
Rx SAFI: Unicast
Local
192.168.12.1 from 192.168.12.1 (11.11.11.11) <<< Sw1
Origin IGP, metric 0, localpref 100, IGP metric 1, weight 0, received 00:00:19 ago, valid, internal, ECMP, ECMP contributor
Rx SAFI: Unicast
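
The age-based behaviour seen in step 5 can be modelled in a few lines of Python. This is only a toy model of what the output above shows, not how EOS implements it; the `Path` class and the `ecmp_head_by_age` function are hypothetical names, and the ages are the "received ... ago" values from the output converted to seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    peer: str
    age_seconds: int  # "received ... ago" from show ip bgp, in seconds


def ecmp_head_by_age(paths):
    """Toy model: with no tie-break configured, the oldest path is the head."""
    return max(paths, key=lambda p: p.age_seconds)


# Values taken from the output above, after the Sw1 session was flapped.
paths = [
    Path("192.168.23.3", 14 * 60 + 32),  # Sw3, received 00:14:32 ago
    Path("192.168.12.1", 19),            # Sw1, received 00:00:19 ago
]

head = ecmp_head_by_age(paths)
print(head.peer)
```

In this model the Sw3 path wins simply because it has been in the table longer, which matches the ECMP head shown above.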

6. When we configure “bgp bestpath tie-break cluster-list-length” on Sw2, we notice that the ECMP head now becomes the prefix learnt from the neighbor with the shortest cluster list length.

Sw2:
router bgp 1
maximum-paths 2
bgp bestpath tie-break cluster-list-length
neighbor 192.168.12.1 remote-as 1
neighbor 192.168.12.1 next-hop-self
neighbor 192.168.23.3 remote-as 1
neighbor 192.168.23.3 next-hop-self

Sw2#sh ip bgp 11.11.11.11/32 detail <<<
BGP routing table information for VRF default
Router identifier 192.168.23.2, local AS number 1
Route status: [a.b.c.d] - Route is queued for advertisement to peer.
BGP routing table entry for 11.11.11.11/32
Paths: 2 available
Local
192.168.12.1 from 192.168.12.1 (11.11.11.11) <<<Sw1
Origin IGP, metric 0, localpref 100, IGP metric 1, weight 0, received 00:19:09 ago, valid, internal, ECMP head, ECMP, best, ECMP contributor
Rx SAFI: Unicast
Local
192.168.23.3 from 192.168.23.3 (192.168.23.3) <<< Sw3
Origin IGP, metric 0, localpref 100, IGP metric 1, weight 0, received 00:45:27 ago, valid, internal, ECMP, ECMP contributor
Originator: 11.11.11.11, Cluster list: 192.168.23.3
Rx SAFI: Unicast
Not best: Cluster list length tie-break configured
Not advertised to any peer

Also, if the number of available equally preferred paths exceeds the configured maximum-paths value, the paths with the shorter cluster list lengths are chosen as ECMP contributors and are therefore installed in the routing table.
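
As a rough illustration of that selection, here is a small Python sketch. It is a simplified model under stated assumptions, not the EOS implementation: `BgpPath` and `pick_contributors` are hypothetical names, the third path (peer 198.51.100.9 with a two-entry cluster list) is invented for the example, and the model does not say how EOS orders paths whose cluster lists have the same length (here they simply keep their input order).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class BgpPath:
    peer: str
    cluster_list: Tuple[str, ...] = ()  # empty when not reflected


def pick_contributors(paths: List[BgpPath], maximum_paths: int) -> List[BgpPath]:
    """Toy model: rank by cluster list length, keep at most maximum_paths."""
    # sorted() is stable, so equal-length paths keep their input order.
    ranked = sorted(paths, key=lambda p: len(p.cluster_list))
    return ranked[:maximum_paths]


paths = [
    # Hypothetical extra path, reflected twice (not part of the lab above).
    BgpPath("198.51.100.9", ("198.51.100.1", "198.51.100.2")),
    BgpPath("192.168.23.3", ("192.168.23.3",)),  # learnt via Sw3
    BgpPath("192.168.12.1"),                     # learnt directly from Sw1
]

contributors = pick_contributors(paths, maximum_paths=2)
head = contributors[0]  # shortest cluster list becomes the ECMP head
print(head.peer, [p.peer for p in contributors])
```

With maximum-paths 2, the model keeps the Sw1 and Sw3 paths and drops the path with the longest cluster list, and the Sw1 path is the head, as in the step 6 output.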

Hope this helps.

Ref: https://eos.arista.com/eos-4-25-0f/bgp-best-path/
