Posted on September 11, 2020 5:33 pm
 |  Asked by Mencken Davidson

I'm hitting the same issue that someone else described in https://eos.arista.com/forum/issue-with-symmetric-irb/. The original post was marked as “resolved”, but with no indication of *how* it was resolved. Please advise.

I see the same behavior in the most recent releases (4.24.2F / 4.24.2.1F).

I can actually see it working correctly end-to-end, but only for about 10 packets, after which it reverts to the behavior described by the original poster.

Any assistance greatly appreciated.

Posted by Tamas Plugor
Answered on September 11, 2020 5:43 pm

Hi Mencken,

There's a known issue (bug481983), which will be fixed in an upcoming EOS release, that you might be hitting if you're using multiple VRFs:

In a cEOS-lab with dynamic EVPN Type 5 route exchange or static EVPN route configuration, a host cannot reach another host in a different subnet. EOS maps the tenant VRF to an internal IP-less VLAN. The Linux kernel enables the RPF check on all interfaces by default. Because the internal VLAN used by EVPN is IP-less, the kernel drops all IP packets on it by default.

The workaround is to disable the Linux RPF check for all the internal VLANs used by EVPN.
Do the following on each EVPN VTEP:

Step 1: Determine the list of EVPN internal VLANs:

#sh vlan dynamic | grep evpn
evpn 4094

Alternatively, you can find the VLAN ID by following VRF name -> VNI -> VLAN ID.
Search for '<<<<<' in the output below:

VTEP1(vrf:bedrock)#sh running-config interfaces vxlan 1
interface Vxlan1
vxlan source-interface Loopback1
vxlan udp-port 4789
vxlan vlan 750 vni 10750
vxlan vlan 751 vni 10751
vxlan vrf bedrock vni 10000 <<<<<
VTEP1(vrf:bedrock)#sh interfaces vxlan 1
Vxlan1 is up, line protocol is up (connected)
Hardware is Vxlan
Source interface is Loopback1 and is active with 66.66.66.1
Replication/Flood Mode is headend with Flood List Source: EVPN
Remote MAC learning via EVPN
VNI mapping to VLANs
Static VLAN to VNI mapping is
[750, 10750] [751, 10751]
Dynamic VLAN to VNI mapping for 'evpn' is
[4094, 10000] <<<<<
Note: All Dynamic VLANs used by VCS are internal VLANs.
Use 'show vxlan vni' for details.
Static VRF to VNI mapping is
[bedrock, 10000]
Headend replication flood vtep list is:
750 66.66.66.3
MLAG Shared Router MAC is 0000.0000.0000

Step 2: List the RPF value

VTEP2(vrf:bedrock)(config)#bash

Arista Networks EOS shell

bash-4.2# sysctl -a | grep rp_filter | grep 4094
net.ipv4.conf.vlan4094.rp_filter = 2 <<<<< value in the non-working case

Step 3: Disable RPF for each EVPN internal VLAN on each VTEP:
Note: Repeat this step after every reload of the switch as well.

bash-4.2# echo "0" > /proc/sys/net/ipv4/conf/vlan4094/rp_filter
bash-4.2# echo "0" > /proc/sys/net/ipv4/conf/all/rp_filter

Step 4: Verify that RPF has been reset properly

bash-4.2# sysctl -a | grep vlan4094.rp_filter
net.ipv4.conf.vlan4094.rp_filter = 0 <<<<< value needed for ping to work

Let us know if that helps!
Thanks,
Tamas
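
Steps 3 and 4 above can be wrapped in a small shell function so they are easy to repeat after a reload. This is only a minimal sketch, not an Arista-provided tool: the function name disable_rp_filter and the RP_CONF_DIR override are made up for illustration, and it assumes a root bash shell in which the vlanNNNN entries are visible under /proc/sys/net/ipv4/conf. It clears conf/all as well, because Linux applies the higher of the conf/all and per-interface rp_filter values.

```shell
# Sketch only: disable_rp_filter and RP_CONF_DIR are hypothetical names.
# RP_CONF_DIR defaults to the real kernel path; point it elsewhere to dry-run.
disable_rp_filter() {
    conf_dir="${RP_CONF_DIR:-/proc/sys/net/ipv4/conf}"
    for vlan_id in "$@"; do
        target="$conf_dir/vlan$vlan_id/rp_filter"
        if [ -e "$target" ]; then
            # Same effect as: echo "0" > /proc/sys/net/ipv4/conf/vlanN/rp_filter
            echo 0 > "$target"
        else
            echo "vlan$vlan_id has no rp_filter entry in $conf_dir" >&2
            return 1
        fi
    done
    # The kernel uses the maximum of conf/all and the per-interface value,
    # so conf/all has to be cleared too.
    echo 0 > "$conf_dir/all/rp_filter"
}

# Example, for the internal VLAN found in Step 1:
#   disable_rp_filter 4094
```

The function stops with an error instead of creating files when a VLAN entry is missing, which is the symptom to expect if the shell is in the wrong network namespace.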

Posted by Mencken Davidson
Answered on September 11, 2020 7:51 pm

Thanks, Tamas.

I see the dynamic VLAN mapping, but in the bash shell there's nothing configured for the dynamic VLAN interface in /proc/sys/net. sysctl -ar doesn't return anything for the VLAN in question, and /proc/sys/net/ipv4/conf doesn't have a directory for the dynamic VLAN.

--------------------------------------------------------------------------------------------

TPT-LSWA10-020921#sh vxlan vni source evpn
VNI to VLAN Mapping for Vxlan1
VNI VLAN Source Interface 802.1Q Tag
--------- ---------- ------------ --------------- ----------

VNI to dynamic VLAN Mapping for Vxlan1
VNI VLAN VRF Source
----------- ---------- ----------- ------------
50001 4049 evpn1 evpn

TPT-LSWA10-020921#bash

Arista Networks EOS shell

bash-4.2# sysctl -ar vlan4049
bash-4.2# sysctl -ar vlan | grep rp_filter
net.ipv4.conf.vlan2500.arp_filter = 0
net.ipv4.conf.vlan2500.rp_filter = 0
net.ipv4.conf.vlan2513.arp_filter = 0
net.ipv4.conf.vlan2513.rp_filter = 0
net.ipv4.conf.vlan3500.arp_filter = 0
net.ipv4.conf.vlan3500.rp_filter = 0
net.ipv4.conf.vlan3510.arp_filter = 0
net.ipv4.conf.vlan3510.rp_filter = 0
net.ipv4.conf.vlan3511.arp_filter = 0
net.ipv4.conf.vlan3511.rp_filter = 0
net.ipv4.conf.vlan3512.arp_filter = 0
net.ipv4.conf.vlan3512.rp_filter = 0
net.ipv4.conf.vlan3513.arp_filter = 0
net.ipv4.conf.vlan3513.rp_filter = 0
net.ipv4.conf.vlan3514.arp_filter = 0
net.ipv4.conf.vlan3514.rp_filter = 0
net.ipv4.conf.vlan4080.arp_filter = 0
net.ipv4.conf.vlan4080.rp_filter = 0
net.ipv4.conf.vlan4081.arp_filter = 0
net.ipv4.conf.vlan4081.rp_filter = 0
sysctl: reading key "net.ipv6.conf.vlan2500.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan2513.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan3500.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan3510.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan3511.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan3512.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan3513.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan3514.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan4080.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan4081.stable_secret"
bash-4.2#

--------------------------------------------------------------------------------------------

Any other suggestions or advice?
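
For reference, the dynamic VLAN ID can be pulled out of text shaped like the "sh vxlan vni source evpn" output above with a one-line awk filter. This is a rough sketch that assumes the four-column "VNI VLAN VRF Source" layout shown in this thread; the layout may differ between releases, and evpn_dynamic_vlans is a made-up helper name.

```shell
# evpn_dynamic_vlans is a hypothetical helper name.
# Reads 'show vxlan vni' style text on stdin and prints the VLAN ID of every
# row whose fourth column (Source) is "evpn" and whose second column is numeric.
evpn_dynamic_vlans() {
    awk '$4 == "evpn" && $2 ~ /^[0-9]+$/ { print $2 }'
}
```

Header and separator rows are skipped because their second column is not a number or their fourth column is not "evpn".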

Posted by Tamas Plugor
Answered on September 14, 2020 11:26 pm

Hi Mencken,

It looks like you are in the default VRF. Can you switch context to the IP VRF, enter bash from there, and rerun the sysctl commands?

bash-4.2# sysctl -a | grep rp_filter | grep 4094
sysctl: reading key "net.ipv6.conf.all.stable_secret"
sysctl: reading key "net.ipv6.conf.cpu.stable_secret"
sysctl: reading key "net.ipv6.conf.default.stable_secret"
sysctl: reading key "net.ipv6.conf.eth0.stable_secret"
sysctl: reading key "net.ipv6.conf.eth1.stable_secret"
sysctl: reading key "net.ipv6.conf.eth3.stable_secret"
sysctl: reading key "net.ipv6.conf.fabric.stable_secret"
sysctl: reading key "net.ipv6.conf.fwd0.stable_secret"
sysctl: reading key "net.ipv6.conf.lo.stable_secret"
sysctl: reading key "net.ipv6.conf.lo0.stable_secret"
sysctl: reading key "net.ipv6.conf.lo1.stable_secret"
sysctl: reading key "net.ipv6.conf.txraw.stable_secret"
sysctl: reading key "net.ipv6.conf.vx1.stable_secret"
sysctl: reading key "net.ipv6.conf.vxlan.stable_secret"
bash-4.2# exit
logout

Leaf2#cli vrf tenant-blue

Leaf2(vrf:tenant-blue)#bash

Arista Networks EOS shell

bash-4.2# sysctl -a | grep rp_filter | grep 4094
sysctl: reading key "net.ipv6.conf.all.stable_secret"
sysctl: reading key "net.ipv6.conf.default.stable_secret"
sysctl: reading key "net.ipv6.conf.fwd0.stable_secret"
sysctl: reading key "net.ipv6.conf.lo.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan1515.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan4094.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan50.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan55.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan60.stable_secret"
net.ipv4.conf.vlan4094.arp_filter = 0
net.ipv4.conf.vlan4094.rp_filter = 1
bash-4.2# echo "0" > /proc/sys/net/ipv4/conf/vlan4094/rp_filter
bash-4.2# echo "0" > /proc/sys/net/ipv4/conf/all/rp_filter

Thanks,
Tamas

Posted by Tamas Plugor
Answered on September 15, 2020 1:18 am

I created a quick demo using docker-topo, which might also be useful:
https://github.com/noredistribution/labs/tree/master/ceos-lab-evpn-irb

HTH,

Tamas

Posted by Mencken Davidson
Answered on September 15, 2020 2:02 am

Thanks, Tamas!

Issue resolved for me :)   A couple of stray observations in case anyone else is running into the issue:

(1) I think you're running a pretty different version of cEOS than I am, because: (a) you use "cli" while I need to use "Cli", and I can't invoke the cli with a vrf argument; and (b) your sysctl output is different from mine (I get a bunch of spurious "sysctl: reading..." lines in my output). I wasn't sure if you were just grooming your output, or if you're working from a really different version.

(2) Since I couldn't invoke Cli with the vrf argument, I just invoked a bash shell in the corresponding netns. I ran "ip netns list" (from the cEOS root-level bash or from CLI-invoked bash; either one seemed to work just fine), then "ip netns exec [netns-name] bash". The naming pattern seems to be that cEOS prepends "ns-" to the EOS-configured VRF name when it creates the associated netns.

(3) Others and I had observed that when this defect was active, we *would* see the incoming traffic when performing a tcpdump on the dynamic VLAN mapped to the L3VNI. I notice now that I *don't* see the traffic in the same tcpdumps (but now I *do* see it egressing the destination VLAN and working end-to-end, so I'm not complaining).
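
As a rough sketch of the netns route from observation (2): the helper below only encodes the "ns-" prefix seen in this thread, which is an observed pattern rather than a documented guarantee, and vrf_netns_name is a made-up name. The commented commands use the VRF (evpn1) and dynamic VLAN (4049) from the earlier output.

```shell
# vrf_netns_name is a hypothetical helper; the "ns-" prefix is the pattern
# observed in this thread, so confirm it with 'ip netns list' first.
vrf_netns_name() {
    printf 'ns-%s\n' "$1"
}

# Example (as root on the switch):
#   ip netns list
#   ip netns exec "$(vrf_netns_name evpn1)" sysctl net.ipv4.conf.vlan4049.rp_filter
#   ip netns exec "$(vrf_netns_name evpn1)" bash
```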

And seriously, thanks again. Having a viable cEOS platform for testing EVPN symmetric IRB, in conjunction with the recently viable routed dot1q subinterfaces, moves cEOS from "toy" to "extremely useful tool" in my environments.

Out of curiosity, where can I find the bug you mentioned (bug481983) documented?  Also, is

Posted by Tamas Plugor
Answered on September 15, 2020 9:22 am

Hi Mencken,

Glad to hear you were able to make it work and find it useful!

I was using the same EOS version as you. Did you get an error message? Note that CLI commands in EOS are lowercase. You were probably in the Linux bash shell rather than in the EOS CLI. From privileged exec mode, or even global config mode, in the EOS CLI you should be able to switch to any VRF on any EOS flavour (EOS, CVX, cEOS, cloudEOS, vEOS).

In the example below I connect to the EOS CLI from the host, go to enable mode, enter bash, and then go back to the EOS CLI, where you can run the cli vrf command (prior to EOS 4.23 this used to be routing-context vrf):

[root@master-node ~]# docker exec -it irb_ceosPC2 Cli
PC2>en
PC2#bash

Arista Networks EOS shell

bash-4.2# cli
bash: cli: command not found
bash-4.2# Cli
PC2>en
PC2#cli ?
  vrf  Enter VRF context

PC2#cli vrf ?
  WORD     VRF name
  default  Default virtual routing and forwarding instance

With that said, as you've found, using ip netns also works just fine, and it's very useful in other situations as well, like connectivity tests, netstat, etc.

Regarding the outputs: yes, I also had those "sysctl: reading" lines; I just omitted them to make the output look cleaner. Regarding the bug: we usually add these to the EOS release notes. I'll have to check which release it made it into, or will make it into, if it's not there already.

Not sure if you wanted to ask something else too; it seems like your response was cut off?

Thanks,
Tamas
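
On the noisy output: the "sysctl: reading key" lines show up even after the grep in the transcripts above, which means they are written to stderr and can be discarded there. The sketch below also narrows the match so that the arp_filter lines, which a plain "grep rp_filter" picks up as well, are left out. only_rp_filter is a made-up helper name, not an EOS or sysctl feature.

```shell
# only_rp_filter is a hypothetical helper name.
# Keeps just the rp_filter line for one VLAN interface from 'sysctl -a' text
# on stdin. Matching on ".rp_filter" excludes the ".arp_filter" lines.
only_rp_filter() {
    grep -F "net.ipv4.conf.vlan$1.rp_filter = "
}

# Example: drop the stderr warnings, then filter for the internal VLAN.
#   sysctl -a 2>/dev/null | only_rp_filter 4094
```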
