Configure and Troubleshoot DNS on EOS

 
 

What is DNS?

The Domain Name System (DNS) maps fully qualified domain names (FQDNs) to IP addresses, so that network devices can be reached by name.

How is it installed on an EOS switch?

EOS ships with a pre-installed dnsmasq service that can be used as a lightweight DNS server. The switch can also run in recursive mode, where it accepts local and external requests and forwards them to an upstream DNS server.

Each network requires at least one server to resolve addresses. The configuration file can list a maximum of three server addresses.

Configuration

Step 1: In EOS CLI configure a name-server:

ip name-server vrf default 8.8.8.8
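The three-server limit and the command form shown in Step 1 can be checked before anything is pasted into the CLI. The following is a minimal Python sketch, not an Arista tool: the function name name_server_commands and the constant MAX_NAME_SERVERS are hypothetical, and the code only validates the address syntax and builds the command strings.

```python
import ipaddress

MAX_NAME_SERVERS = 3  # limit stated in the article


def name_server_commands(addresses, vrf="default"):
    """Return one 'ip name-server' CLI line per address, or raise ValueError."""
    if not addresses:
        raise ValueError("at least one name server is required")
    if len(addresses) > MAX_NAME_SERVERS:
        raise ValueError(f"at most {MAX_NAME_SERVERS} name servers are allowed")
    commands = []
    for addr in addresses:
        # ip_address() raises ValueError for anything that is not an IP literal.
        parsed = ipaddress.ip_address(addr)
        commands.append(f"ip name-server vrf {vrf} {parsed}")
    return commands


for command in name_server_commands(["8.8.8.8", "8.8.4.4"]):
    print(command)
```

The sketch does not test reachability; it only guards against typos and against exceeding the limit.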

 

Step 2: If the switch is being used as a recursive DNS server, configure the below command:

SW(config)#ip domain proxy

This sets external-proxy=1 in /etc/dnsmasq.conf.

 

Step 3: Check that this is reflected in the dnsmasq configuration.

SW#bash cat /etc/dnsmasq.conf

# NOTE NOTE NOTE NOTE NOTE

#

# This file is AUTO-GENERATED based on the system's configuration.

# Any modifications you make to this file will be lost when the

# system's configuration is changed, e.g. from the CLI.

#

no-resolv

server-namespace=default

listen-namespace=ns-outside1,default

server=8.8.8.8

external-proxy=1
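When this check is repeated on several switches, the file contents can be inspected programmatically. The sketch below assumes the output of the cat command has been saved as text; the helper names parse_dnsmasq_conf and proxy_enabled are hypothetical, and the parser only understands the simple key=value and bare-flag lines shown above.

```python
def parse_dnsmasq_conf(text):
    """Collect key=value options and bare flags from dnsmasq.conf text."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        # Options such as server= may repeat, so keep every value in a list.
        options.setdefault(key.strip(), []).append(value.strip())
    return options


def proxy_enabled(options):
    """True when the external-proxy=1 line from the article is present."""
    return options.get("external-proxy") == ["1"]


SAMPLE = """\
# This file is AUTO-GENERATED based on the system's configuration.
no-resolv
server-namespace=default
listen-namespace=ns-outside1,default
server=8.8.8.8
external-proxy=1
"""

conf = parse_dnsmasq_conf(SAMPLE)
print(conf["server"], proxy_enabled(conf))
```

Because the file is auto-generated, the sketch only reads it; any change still has to be made through the CLI.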

 

Step 4: Update the control-plane ACL to allow DNS requests, which are dropped by default. To do this, replace the default control-plane ACL with a new user-defined ACL.

The default-control-plane ACL can be checked using the below command:

SW#show ip access-lists default-control-plane-acl 

IP Access List default-control-plane-acl [readonly]

        statistics per-entry

        10 permit icmp any any [match 2634650, 0:25:30 ago]

        20 permit ip any any tracked [match 1482890, 0:23:20 ago]

        30 permit udp any any eq bfd ttl eq 255

        40 permit udp any any eq bfd-echo ttl eq 254

        50 permit ospf any any

        60 permit tcp any any eq ssh telnet www snmp bgp https msdp [match 663, 1:29:45 ago]

        70 permit udp any any eq bootps bootpc snmp rip ntp [match 149108, 0:50:25 ago]

        80 permit tcp any any eq mlag ttl eq 255

        90 permit udp any any eq mlag ttl eq 255

        100 permit vrrp any any

        110 permit ahp any any

        120 permit pim any any

        130 permit igmp any any

        140 permit tcp any any range 5900 5910

        150 permit tcp any any range 50000 50100

        160 permit udp any any range 51000 51100

 

Step 5: Copy default-control-plane-acl to a text file and add statements 170 and 180 to permit DNS traffic. After editing, the ACL should look like this:

        statistics per-entry

        10 permit icmp any any

        20 permit ip any any tracked

        30 permit udp any any eq bfd ttl eq 255

        40 permit udp any any eq bfd-echo ttl eq 254

        50 permit ospf any any

        60 permit tcp any any eq ssh telnet www snmp bgp https msdp

        70 permit udp any any eq bootps bootpc snmp rip ntp

        80 permit tcp any any eq mlag ttl eq 255

        90 permit udp any any eq mlag ttl eq 255

        100 permit vrrp any any

        110 permit ahp any any

        120 permit pim any any

        130 permit igmp any any

        140 permit tcp any any range 5900 5910

        150 permit tcp any any range 50000 50100

        160 permit udp any any range 51000 51100

        170 permit tcp any any eq domain

        180 permit udp any any eq domain 

        

Step 6: Because the default ACL is read-only, enter the modified entries into a new control-plane ACL. Alternatively, if a custom user-defined control-plane ACL is already configured, make sure that it permits DNS traffic; if it does not, add statements 170 and 180.

SW(config)#ip access-list control-plane

SW(config-acl-control-plane)#statistics per-entry

SW(config-acl-control-plane)#10 permit icmp any any

SW(config-acl-control-plane)#20 permit ip any any tracked

SW(config-acl-control-plane)#30 permit udp any any eq bfd ttl eq 255

SW(config-acl-control-plane)#40 permit udp any any eq bfd-echo ttl eq 254

SW(config-acl-control-plane)#50 permit ospf any any

SW(config-acl-control-plane)#60 permit tcp any any eq ssh telnet www snmp bgp https msdp

SW(config-acl-control-plane)#70 permit udp any any eq bootps bootpc snmp rip ntp

SW(config-acl-control-plane)#80 permit tcp any any eq mlag ttl eq 255

SW(config-acl-control-plane)#90 permit udp any any eq mlag ttl eq 255

SW(config-acl-control-plane)#100 permit vrrp any any

SW(config-acl-control-plane)#110 permit ahp any any

SW(config-acl-control-plane)#120 permit pim any any

SW(config-acl-control-plane)#130 permit igmp any any

SW(config-acl-control-plane)#140 permit tcp any any range 5900 5910

SW(config-acl-control-plane)#150 permit tcp any any range 50000 50100

SW(config-acl-control-plane)#160 permit udp any any range 51000 51100

SW(config-acl-control-plane)#170 permit tcp any any eq domain

SW(config-acl-control-plane)#180 permit udp any any eq domain

SW(config-acl-control-plane)#exit
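Steps 5 and 6 involve copying the default ACL by hand, which is easy to get wrong. As an illustration only, the Python sketch below turns saved "show ip access-lists" output into the configuration lines used above. The function name build_acl is hypothetical, and the sketch assumes that sequence numbers 170 and 180 are unused in the source ACL.

```python
import re

# Matches the "[match N, H:MM:SS ago]" counters shown by 'show ip access-lists'.
COUNTER = re.compile(r"\s*\[match [^\]]*\]")

DNS_RULES = [
    "170 permit tcp any any eq domain",
    "180 permit udp any any eq domain",
]


def build_acl(show_output, name="control-plane"):
    """Turn 'show ip access-lists' output into config lines plus DNS rules."""
    lines = [f"ip access-list {name}"]
    for raw in show_output.splitlines():
        line = COUNTER.sub("", raw).strip()
        # Keep 'statistics per-entry' and numbered rules; drop the header line.
        if line == "statistics per-entry" or re.match(r"\d+ (permit|deny) ", line):
            lines.append(line)
    existing = set(lines)
    # Append the DNS rules only if they are not already present verbatim.
    lines.extend(rule for rule in DNS_RULES if rule not in existing)
    lines.append("exit")
    return lines
```

The returned list can be reviewed and then pasted into configuration mode; the sketch does not connect to the switch.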

 

Step 7: Apply the new ACL to the control plane.

SW(config)#control-plane 

SW(config-cp)#ip access-group control-plane in

SW(config-cp)#exit

 

Step 8: Configure your client to use the Arista switch as the DNS server.

 

Verification

Check the ACL statistics with the command below to verify that packets are matching the DNS ACL rules:

SW(config)#show ip access-lists control-plane

IP Access List control-plane

        statistics per-entry

        10 permit icmp any any

        20 permit ip any any tracked [match 734, 0:00:00 ago]

        30 permit udp any any eq bfd ttl eq 255

        40 permit udp any any eq bfd-echo ttl eq 254

        50 permit ospf any any

        60 permit tcp any any eq ssh telnet www snmp bgp https msdp

        70 permit udp any any eq bootps bootpc snmp rip ntp [match 4, 0:00:02 ago]

        80 permit tcp any any eq mlag ttl eq 255

        90 permit udp any any eq mlag ttl eq 255

        100 permit vrrp any any

        110 permit ahp any any

        120 permit pim any any

        130 permit igmp any any

        140 permit tcp any any range 5900 5910

        150 permit tcp any any range 50000 50100

        160 permit udp any any range 51000 51100

        170 permit tcp any any eq domain

        180 permit udp any any eq domain [match 165, 0:00:02 ago]
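To watch the DNS counters without reading the whole listing, the match values can be extracted from saved output. This is a hedged Python sketch: match_counts and dns_hits are hypothetical helper names, and the regular expression assumes the counter format shown above ("[match N, H:MM:SS ago]").

```python
import re

# Sequence number, rule text, and an optional "[match N, ... ago]" counter.
RULE = re.compile(
    r"^\s*(?P<seq>\d+)\s+(?P<rule>.*?)(?:\s+\[match (?P<count>\d+),[^\]]*\])?\s*$"
)


def match_counts(show_output):
    """Map each ACL rule text to its match counter (0 when none is shown)."""
    counts = {}
    for line in show_output.splitlines():
        m = RULE.match(line)
        if m:
            counts[m.group("rule")] = int(m.group("count") or 0)
    return counts


def dns_hits(show_output):
    """Return the counters of the rules that end in 'eq domain'."""
    return {
        rule: count
        for rule, count in match_counts(show_output).items()
        if rule.endswith("eq domain")
    }
```

A counter of 0 on both DNS rules after a client has sent queries suggests that the traffic is not reaching the control plane, or that it is matched by an earlier entry.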

 

Troubleshooting

Issue: Unable to perform the resolution. 

SW#bash nslookup arista.com

;; connection timed out; trying next origin

;; connection timed out; no servers could be reached

1. Ping the DNS server to check reachability.

SW#ping 8.8.8.8

If name-servers are configured under a VRF, ping using the VRF name.

SW#ping vrf mgmt 8.8.8.8

 

2. Check the IP route to the DNS server to identify the exit interface for the packets.

SW#show ip route 8.8.8.8

VRF: default

---snipped---

 S        8.8.8.8/32 [1/0] via 172.28.160.1, Management1

 

3. Check whether a telnet connection to the DNS server on port 53 succeeds.

SW#telnet 8.8.8.8 53 /source-interface lo0

Trying 8.8.8.8...

Connected to 8.8.8.8.

Escape character is 'off'.

Connection closed by foreign host.

Note: The source-interface needs to be specified in the telnet command only if the switch is forced to use a source-interface for originating the DNS requests using the below command:

SW(config)#ip domain lookup source-interface Loopback 0

 

4. Take a packet capture on the exit interface (as identified in point 2) while attempting to resolve the name. Start the capture first, then run the lookup from a second session:

SW#bash tcpdump -nevvi ma1 port 53 -w /mnt/flash/dns.pcap

SW#bash nslookup www.google.com

In the packet capture, confirm that the query is sent by the switch and that a reply is received. If the exchange is incomplete, identify on which side the communication breaks.
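A single, easily recognizable query makes the capture simpler to read. The Python sketch below builds a standard A-record query by hand (header and question layout per RFC 1035) and sends it over UDP port 53. It is an illustration, not part of EOS: build_query and probe are hypothetical names, the query ID 0x1234 is arbitrary, and when the name server sits in a VRF the script would have to run inside the matching network namespace.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Encode a standard recursive A-record query for `name`."""
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed with its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def probe(server, name, timeout=2.0):
    """Send the query over UDP/53; return the reply RCODE, or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        try:
            reply, _ = sock.recvfrom(512)
        except socket.timeout:
            return None
    # The low four bits of the fourth header byte carry the response code.
    return reply[3] & 0x0F if len(reply) >= 4 else None
```

Calling probe("8.8.8.8", "arista.com") while the capture is running should produce one query with ID 0x1234 in the pcap; a return value of None corresponds to the timeout seen with nslookup.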

 

5. Check that the ACLs are correctly configured for both TCP and UDP DNS traffic.

 

6. Run the command below to check the inodes of the LISTEN sockets:

SW#bash

 Arista Networks EOS shell

[admin@SW ~]$ sudo su

bash-4.3# ip -all netns exec netstat -aunlpte | grep :53 | grep -i listen

The command output should look like below if the switch is used as a regular DNS server

SW#bash sudo  ip -all netns exec netstat -aunlpte | grep :53 | grep -i listen

Proto Recv-Q Send-Q Local Address           Foreign Address         State       User       Inode      PID/Program name

tcp        0      0 127.0.0.1:53            0.0.0.0:*               LISTEN      0          20101562   13671/dnsmasq

tcp6       0      0 ::1:53                  :::*                    LISTEN      0          20101564   13671/dnsmasq

If the switch is used as a recursive DNS server, the output should look like below:

SW#bash sudo  ip -all netns exec netstat -aunlpte | grep :53 | grep -i listen

Proto Recv-Q Send-Q Local Address           Foreign Address         State       User       Inode      PID/Program name

tcp        0      0 0.0.0.0:53              0.0.0.0:*               LISTEN      0          19312749   17461/dnsmasq

tcp6       0      0 :::53                   :::*                    LISTEN      0          19312751   17461/dnsmasq

Note: As a regular DNS server, the switch listens on the localhost address (127.0.0.1) and port 53. When used as a recursive DNS server, it listens on 0.0.0.0 and port 53.
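The distinction between the two outputs can also be checked from saved netstat text. The following Python sketch is illustrative only: listen_scope is a hypothetical helper, and it assumes the column layout shown above, with the local address in the fourth column and the state in the sixth.

```python
def listen_scope(netstat_output, port=53):
    """Classify LISTEN sockets on `port` as 'all', 'local', 'other' or 'none'."""
    scopes = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state, ...
        if len(fields) < 6 or fields[5] != "LISTEN":
            continue
        host, _, local_port = fields[3].rpartition(":")
        if local_port != str(port):
            continue
        if host in ("0.0.0.0", "::"):
            scopes.add("all")      # reachable by clients (recursive mode)
        elif host in ("127.0.0.1", "::1"):
            scopes.add("local")    # localhost only (regular mode)
        else:
            scopes.add("other")
    for scope in ("all", "local", "other"):
        if scope in scopes:
            return scope
    return "none"
```

A result of "none" corresponds to the case where dnsmasq is not listening on port 53 at all.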

 

7. If the above output shows that the DNS service is not listening on port 53, check the status of the dnsmasq service.

SW#bash sudo systemctl status dnsmasq

* dnsmasq.service - SYSV: This script starts your DNS caching server

   Loaded: loaded (/etc/rc.d/init.d/dnsmasq; bad; vendor preset: disabled)

   Active: inactive (dead)

     Docs: man:systemd-sysv-generator(8)

% 'sudo systemctl status dnsmasq' returned error code: 3

 

If the service is not running, start the service using the below command and check the status again.

SW#bash sudo systemctl start dnsmasq

SW#bash sudo systemctl status dnsmasq

* dnsmasq.service - SYSV: This script starts your DNS caching server

   Loaded: loaded (/etc/rc.d/init.d/dnsmasq; bad; vendor preset: disabled)

   Active: active (running) since Fri 2020-12-18 20:26:10 UTC; 2s ago

     Docs: man:systemd-sysv-generator(8)

  Process: 5716 ExecStart=/etc/rc.d/init.d/dnsmasq start (code=exited, status=0/SUCCESS)

 Main PID: 4220 (dnsmasq)

   CGroup: /system.slice/dnsmasq.service

           > 4220 /usr/sbin/dnsmasq

 

8. Check the system logs for any dnsmasq errors reported by the switch.

SW#show logging system | grep dnsmasq

2020 Aug 25 17:50:53 sw.aristanetworks.com dnsmasq[12929]: FAILED to start up

2020 Aug 25 17:50:55 sw.aristanetworks.com dnsmasq[12962]: failed to create listening socket for port 53: Address already in use

 

9. If the "Address already in use" error is seen, you can try killing the process:

SW# bash sudo kill -9 <pid>

SW#bash sudo kill -9 12962

Note: Killing the process leaves the service inactive, so it must be restarted manually with the command below:

SW#bash sudo service dnsmasq restart

Shutting down dnsmasq:                                     [  OK  ]

Starting dnsmasq:                                          [  OK  ]

 

If the issue persists, collect the outputs below and contact Arista TAC support by emailing support@arista.com.

bash commands:

SW# bash sudo tar -czvf /mnt/flash/TAC-$HOSTNAME-HistShowTech-logs.tar.gz /mnt/flash/schedule/tech-support/*
SW# bash sudo tar -czvf /mnt/flash/TAC-$HOSTNAME-agent_logs.tar.gz /var/log/agents/*
SW# bash sudo tar -czvf /mnt/flash/TAC-$HOSTNAME-logging_system.tar.gz /var/log/messages*
SW# bash sudo lsof -i -P -n | grep LISTEN
SW# bash sudo ps aux | grep dns
SW# bash sudo netstat -tulpn | grep :53
SW# bash sudo netstat -tulp | grep domain
SW# bash sudo ip -all netns exec netstat -aunlpte
SW# bash sudo ip netns exec default netstat -aunlpte
SW# bash sudo ip netns exec mgmt netstat -aunlpte   (run this only if DNS is configured in a VRF, here mgmt)
SW# bash sudo 'lsof'
SW# bash sudo systemctl status -a