Same issue as described by someone else in: https://eos.arista.com/forum/issue-with-symmetric-irb/ The original post was marked as “resolved”, but with no indication as to *how* it was resolved. Please advise?
I see the same behavior in the most recent releases (4.24.2F / 188.8.131.52F).
I can actually see it working correctly end-to-end, but only for about 10 packets or so, after which it reverts to the behavior described by the original poster.
Any assistance greatly appreciated.
If you're using multiple VRFs, you might be hitting a known issue (bug481983) that will be fixed in an upcoming EOS release.
In a cEOS-lab with dynamic EVPN Type 5 route exchange or static EVPN route configuration, a host cannot reach another host in a different subnet. EOS maps the tenant VRF to an internal IP-less VLAN. The Linux kernel enables the RPF check on all interfaces by default. Because the internal VLAN used by EVPN is IP-less, it drops all IP packets by default.
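For anyone wondering why the per-interface value and the "all" value both matter: per the Linux kernel's ip-sysctl documentation, the value used for source validation on an interface is the maximum of conf/all/rp_filter and conf/<interface>/rp_filter (0 = no validation, 1 = strict, 2 = loose). A minimal sketch of that rule, where the function name effective_rp_filter is just an illustrative helper and not an EOS or kernel tool:

```shell
#!/bin/sh
# effective_rp_filter ALL_VALUE IFACE_VALUE
# Hypothetical helper: prints the rp_filter value the kernel would apply,
# i.e. the larger of the "all" setting and the per-interface setting.
effective_rp_filter() {
    if [ "$1" -gt "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}
```

So with all = 1 and vlan4094 = 0 the check is still strict on vlan4094; the effective value only becomes 0 when both are 0.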
The workaround is to disable the Linux RPF check for all the internal VLANs used by EVPN.
Step 1: Determine the list of EVPN internal VLANs:
#sh vlan dynamic | grep evpn
evpn 4094
Alternatively, you can find the VLAN ID by following VRF name -> VNI -> VLAN ID:
VTEP1(vrf:bedrock)#sh running-config interfaces vxlan 1
interface Vxlan1
   vxlan source-interface Loopback1
   vxlan udp-port 4789
   vxlan vlan 750 vni 10750
   vxlan vlan 751 vni 10751
   vxlan vrf bedrock vni 10000 <<<<<
VTEP1(vrf:bedrock)#sh interfaces vxlan 1
Vxlan1 is up, line protocol is up (connected)
  Hardware is Vxlan
  Source interface is Loopback1 and is active with 184.108.40.206
  Replication/Flood Mode is headend with Flood List Source: EVPN
  Remote MAC learning via EVPN
  VNI mapping to VLANs
  Static VLAN to VNI mapping is
    [750, 10750]      [751, 10751]
  Dynamic VLAN to VNI mapping for 'evpn' is
    [4094, 10000] <<<<<
  Note: All Dynamic VLANs used by VCS are internal VLANs.
        Use 'show vxlan vni' for details.
  Static VRF to VNI mapping is [bedrock, 10000]
  Headend replication flood vtep list is:
   750 220.127.116.11
  MLAG Shared Router MAC is 0000.0000.0000
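If you have several VTEPs to check, pulling the dynamic VLAN ID out of saved "show interfaces vxlan 1" output can be scripted. This is only a sketch: it assumes the output has been captured to a file or pipe with its normal line breaks, and the function name dynamic_evpn_vlans is made up for illustration.

```shell
#!/bin/sh
# dynamic_evpn_vlans: read "show interfaces vxlan 1" text on stdin and print
# the VLAN ID of every [VLAN, VNI] pair listed under the
# "Dynamic VLAN to VNI mapping" heading, one ID per line.
dynamic_evpn_vlans() {
    awk '
        # Start collecting at the dynamic mapping heading.
        /Dynamic VLAN to VNI mapping/ { in_map = 1 }
        # Stop at the static mapping headings or the trailing note.
        /Static V(LAN|RF) to VNI mapping|Note:/ { in_map = 0 }
        in_map {
            line = $0
            # Walk through every "[vlan, vni]" pair on the line.
            while (match(line, /\[[0-9]+, *[0-9]+\]/)) {
                pair = substr(line, RSTART + 1, RLENGTH - 2)
                split(pair, f, /, */)
                print f[1]
                line = substr(line, RSTART + RLENGTH)
            }
        }
    '
}
```

For example, "dynamic_evpn_vlans < vxlan1.txt" (vxlan1.txt being a hypothetical capture of the output above) would list the internal VLAN IDs to use in the next steps.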
Step 2: List the RPF value:
VTEP2(vrf:bedrock)(config)#bash
Arista Networks EOS shell
bash-4.2# sysctl -a | grep rp_filter | grep 4094
net.ipv4.conf.vlan4094.rp_filter = 2    // in the non-working case
Step 3: Disable RPF for each EVPN internal VLAN on each VTEP:
bash-4.2# echo "0" > /proc/sys/net/ipv4/conf/vlan4094/rp_filter
bash-4.2# echo "0" > /proc/sys/net/ipv4/conf/all/rp_filter
Step 4: Verify RPF is reset properly:
bash-4.2# sysctl -a | grep vlan4094.rp_filter
net.ipv4.conf.vlan4094.rp_filter = 0    // for a working ping
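Steps 3 and 4 can be wrapped in a small function if there are several internal VLANs or VTEPs to touch. This is a rough sketch rather than an Arista-provided tool: the function name disable_evpn_rpf and the CONF_ROOT override (only there so the function can be tried against a scratch directory) are assumptions, and it must be run from bash inside the tenant VRF context so the vlanNNNN directories are visible.

```shell
#!/bin/sh
# disable_evpn_rpf VLAN_ID...
# Writes 0 to rp_filter for "all" and for each vlan<ID> interface, then
# prints the value read back. Returns non-zero if any entry is missing.
# CONF_ROOT is a hypothetical override; it defaults to the real /proc path.
disable_evpn_rpf() {
    conf_root=${CONF_ROOT:-/proc/sys/net/ipv4/conf}
    rc=0
    for name in all "$@"; do
        case $name in
            all) dir=$conf_root/all ;;
            *)   dir=$conf_root/vlan$name ;;
        esac
        if [ -w "$dir/rp_filter" ]; then
            echo 0 > "$dir/rp_filter"
            # Read back to confirm the change took effect.
            echo "$dir/rp_filter = $(cat "$dir/rp_filter")"
        else
            echo "skip: $dir/rp_filter not found or not writable" >&2
            rc=1
        fi
    done
    return $rc
}
```

Usage would be something like "disable_evpn_rpf 4094" on each VTEP. A "skip" message for the VLAN usually means the shell is not in the VRF that owns the interface.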
Let us know if that helps!
I see the dynamic VLAN mapping, but in the bash shell there's nothing configured for the dynamic VLAN interface in /proc/sys/net. sysctl -ar doesn't return anything for the VLAN in question, and /proc/sys/net/ipv4/conf doesn't have a directory for the dynamic VLAN.
TPT-LSWA10-020921#sh vxlan vni source evpn
VNI to dynamic VLAN Mapping for Vxlan1
Arista Networks EOS shell
bash-4.2# sysctl -ar vlan4049
Any other suggestions/advice??
It seems like you are in the default VRF. Can you switch context to the IP VRF, go to bash after that, and check the sysctl commands again?
bash-4.2# sysctl -a | grep rp_filter | grep 4094
sysctl: reading key "net.ipv6.conf.all.stable_secret"
sysctl: reading key "net.ipv6.conf.cpu.stable_secret"
sysctl: reading key "net.ipv6.conf.default.stable_secret"
sysctl: reading key "net.ipv6.conf.eth0.stable_secret"
sysctl: reading key "net.ipv6.conf.eth1.stable_secret"
sysctl: reading key "net.ipv6.conf.eth3.stable_secret"
sysctl: reading key "net.ipv6.conf.fabric.stable_secret"
sysctl: reading key "net.ipv6.conf.fwd0.stable_secret"
sysctl: reading key "net.ipv6.conf.lo.stable_secret"
sysctl: reading key "net.ipv6.conf.lo0.stable_secret"
sysctl: reading key "net.ipv6.conf.lo1.stable_secret"
sysctl: reading key "net.ipv6.conf.txraw.stable_secret"
sysctl: reading key "net.ipv6.conf.vx1.stable_secret"
sysctl: reading key "net.ipv6.conf.vxlan.stable_secret"
bash-4.2# exit
logout

Leaf2#cli vrf tenant-blue
Leaf2(vrf:tenant-blue)#bash
Arista Networks EOS shell
bash-4.2# sysctl -a | grep rp_filter | grep 4094
sysctl: reading key "net.ipv6.conf.all.stable_secret"
sysctl: reading key "net.ipv6.conf.default.stable_secret"
sysctl: reading key "net.ipv6.conf.fwd0.stable_secret"
sysctl: reading key "net.ipv6.conf.lo.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan1515.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan4094.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan50.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan55.stable_secret"
sysctl: reading key "net.ipv6.conf.vlan60.stable_secret"
net.ipv4.conf.vlan4094.arp_filter = 0
net.ipv4.conf.vlan4094.rp_filter = 1
bash-4.2# echo "0" > /proc/sys/net/ipv4/conf/vlan4094/rp_filter
bash-4.2# echo "0" > /proc/sys/net/ipv4/conf/all/rp_filter
Created a quick demo using docker-topo which might be also useful:
Issue resolved for me :) A couple of stray observations in case anyone else is running into the issue:
(1) I think you're running a pretty different version of cEOS than I am, because: (a) you use "cli" while I need to use "Cli", and I can't invoke the cli with a vrf argument; and (b) your sysctl output is different from mine (I get a bunch of spurious "sysctl: reading..." lines in my output). I wasn't sure if you were just grooming your output, or if you have a really different version you're working from.
(2) Since I couldn't invoke Cli with the vrf argument, I just invoked a bash shell in the corresponding netns. I ran "ip netns list" (from the cEOS root-level bash or a CLI-invoked bash; either one seemed to work just fine), and then "ip netns exec [netns-name] bash". The naming pattern seems to be that cEOS prepends "ns-" to the EOS-configured VRF name when it creates the associated netns.
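For anyone taking the netns route, the same check and fix can be run without opening an interactive shell in the namespace. A minimal sketch, assuming the "ns-" prefix pattern observed above holds on your build (confirm with "ip netns list"); the function names are made up for illustration:

```shell
#!/bin/sh
# vrf_netns VRF_NAME
# Prints the network namespace name for an EOS VRF, assuming the "ns-"
# naming pattern observed in this thread (not a documented guarantee).
vrf_netns() {
    echo "ns-$1"
}

# show_rp_filter_in_vrf VRF_NAME VLAN_ID
# Reads rp_filter for vlan<ID> inside the VRF's namespace.
show_rp_filter_in_vrf() {
    ip netns exec "$(vrf_netns "$1")" sysctl "net.ipv4.conf.vlan$2.rp_filter"
}

# clear_rp_filter_in_vrf VRF_NAME VLAN_ID
# Sets rp_filter to 0 for "all" and for vlan<ID> inside the namespace.
clear_rp_filter_in_vrf() {
    ip netns exec "$(vrf_netns "$1")" sysctl -w net.ipv4.conf.all.rp_filter=0
    ip netns exec "$(vrf_netns "$1")" sysctl -w "net.ipv4.conf.vlan$2.rp_filter=0"
}
```

With the VRF and VLAN from this thread, that would be "show_rp_filter_in_vrf tenant-blue 4094" followed by "clear_rp_filter_in_vrf tenant-blue 4094", run as root from the cEOS bash shell.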
(3) Others and I had observed that when this defect was active, we *would* see the incoming traffic when performing a tcpdump on the dynamic VLAN mapped to the L3VNI. I notice now that I *don't* see the traffic in the same tcpdumps (but now I *do* see it actually egressing the destination VLAN and actually working end-to-end, so I'm not complaining).
And seriously, thanks again. Having a viable cEOS platform for testing EVPN symmetric IRB in conjunction with the recently viable routed dot1q subinterfaces moves cEOS from "toy" to "extremely useful tool" in my environments.
Out of curiosity, where can I find the bug (bug481983 ) you mentioned documented? Also, is
Glad to hear you were able to make it work and find it useful!
I was using the same EOS version as you. Is there an error message that you were getting? Note that CLI commands in EOS are lowercase. You were probably not in the EOS CLI but in the Linux bash shell instead. If you go to privileged exec mode or even global config mode in the EOS CLI, you should be able to switch to any VRF on any EOS flavour (EOS, CVX, cEOS, cloudEOS, vEOS), e.g.:
In the example below, I'm connecting to enable mode in the EOS CLI from the host, then entering bash, and then going back to the EOS CLI, where you can run the
[root@master-node ~]# docker exec -it irb_ceosPC2 Cli
PC2>en
PC2#bash
Arista Networks EOS shell
bash-4.2# cli
bash: cli: command not found
bash-4.2# Cli
PC2>en
PC2#cli ?
  vrf  Enter VRF context
PC2#cli vrf ?
  WORD     VRF name
  default  Default virtual routing and forwarding instance
With that said, as you've found, using ip netns also works just fine, and it's very useful in other situations as well, like connectivity tests, netstat, etc.
Regarding the outputs: yes, I also had those "sysctl: reading" lines; I just omitted them to make it look cleaner. Regarding the bug, we usually add it to the EOS release notes. I will have to check which release it did or will make it into, if it's not there already.
Not sure if you wanted to ask something else too, seems like your response was cut off?