Onboarding a switch in CVP

 
 

Description

This article explains how to onboard a switch in CVP 2019.1.x/2020.1.x and walks through what happens during registration. It also covers the troubleshooting steps to take when registration fails.

Platform compatibility

This feature is supported on all platforms.

Configuration

On the Switch:

Before onboarding, eAPI (command-api) must be enabled on the switch so that the switch and CVP can communicate via eAPI. This can be done as follows:

Arista#configure 
Arista(config)#management api http-commands 
Arista(config-mgmt-api-http-cmds)#no shut
Arista(config-mgmt-api-http-cmds)#show active
management api http-commands
    no shutdown

By default, communication via HTTPS is enabled. To also enable communication via HTTP:

Arista(config-mgmt-api-http-cmds)# protocol http

Devices running EOS releases prior to 4.20.x additionally require the configuration below:

Arista(config-mgmt-api-http-cmds)# protocol unix-socket 

Show Commands

The “show management api http-commands” command shows whether eAPI is enabled on the switch, along with other information such as the protocols in use, port numbers, VRF, logged-in users and URLs, as shown below:

Arista(config-mgmt-api-http-cmds)#show management api http-commands 
Enabled:            Yes
HTTPS server:       running, set to use port 443
HTTP server:        shutdown, set to use port 80
Local HTTP server:  shutdown, no authentication, set to use port 8080
Unix Socket server: shutdown, no authentication
VRFs:               default
Hits:               18
Last hit:           5382 seconds ago
Bytes in:           4192
Bytes out:          5495
Requests:           16
Commands:           44
Duration:           4.564 seconds
SSL Profile:        none
FIPS Mode:          No
QoS DSCP:           0
Log Level:          none
CSP Frame Ancestor: None
TLS Protocols:      1.0 1.1 1.2
User           Requests       Bytes in       Bytes out       Last hit
-------------- -------------- -------------- --------------- ----------------
cvpadmin       16             4192           5495            5382 seconds ago

URLs
-------------------------------------
Vlan320     : https://10.81.45.33:443

The URL above can be used to test whether eAPI is working: paste it into a web browser.

The page prompts for the username and password used to log in to the switch. Once the credentials are entered, you can run commands on the switch via eAPI.
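The same test can be scripted instead of using a browser. The sketch below assumes eAPI's JSON-RPC interface at the /command-api path with the runCmds method. The helper names are hypothetical, the address and username are the example values from this article, and certificate verification is skipped only on the assumption that the switch presents a self-signed certificate.

```python
import base64
import json
import ssl
import urllib.request


def build_eapi_request(cmds, request_id="1"):
    """Build the JSON-RPC 2.0 body for an eAPI runCmds call."""
    return {
        "jsonrpc": "2.0",
        "method": "runCmds",
        "params": {"version": 1, "cmds": list(cmds), "format": "json"},
        "id": request_id,
    }


def run_cmds(host, username, password, cmds, timeout=10):
    """POST the commands to https://<host>/command-api and return the results."""
    body = json.dumps(build_eapi_request(cmds)).encode()
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    req = urllib.request.Request(
        f"https://{host}/command-api",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )
    # Assumption: self-signed certificate on the switch, so verification is off.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(req, timeout=timeout, context=ctx) as resp:
        reply = json.load(resp)
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["result"]


# Example call (placeholder password):
#   run_cmds("10.81.45.33", "cvpadmin", "<password>", ["show version"])
```

An HTTP 401 from this call points at credentials, while a connection error points at eAPI being shut down or unreachable, which matches the failure cases covered in the troubleshooting section.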

How to onboard a switch using CVP GUI?

1. Before onboarding, check that the device is running an EOS version supported by your CVP release.

Note: If you are on CVP 2019.1.x and add a switch running EOS 4.23.x, or on CVP 2020.1.x and add a switch running EOS 4.24.x, you will see an error stating that the EOS version is too high.

This notice is typically harmless and should have no impact. It is only a warning from Telemetry's perspective that EOS 4.23+/4.24+ isn't *fully* supported. Some Sysdb/Smash paths changed between 4.22 and 4.23 and above, so a limited amount of Telemetry data may show up as N/A. However, 95% of telemetry states should still be fine.

Workaround:
To get rid of the event alerts stating that the EOS version is too high, do the following from the CVP CLI:

a) Edit /cvpi/apps/turbine/configs/event-unsupported-version-max-eos.yml and change MinorVersion: 22 to MinorVersion: 24 (or any higher number).
b) Restart the turbine component by running cvpi stop turbine-event-unsupported-version-max-eos && cvpi start turbine-event-unsupported-version-max-eos
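
The edit in step a) amounts to raising a single number. A minimal sketch, assuming the file contains a line of the form MinorVersion: 22 as quoted above. The rest of the file's layout is not shown in this article, so the hypothetical helper raise_minor_version only rewrites that key and leaves every other line untouched.

```python
import re

# Matches a line whose key is exactly MinorVersion, optionally as a list item.
_MINOR = re.compile(r"^([ \t]*(?:-[ \t]+)?MinorVersion:[ \t]*)\d+[ \t]*$", re.MULTILINE)


def raise_minor_version(yaml_text, new_minor):
    """Return yaml_text with each 'MinorVersion: <n>' value set to new_minor."""
    updated, count = _MINOR.subn(lambda m: f"{m.group(1)}{new_minor}", yaml_text)
    if count == 0:
        raise ValueError("no MinorVersion key found")
    return updated
```

Reading the file, passing its text through this function and writing it back (after taking a backup copy) would be followed by the restart in step b).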

2. The switch can be onboarded using the CVP GUI. To do so:

  • Navigate to the Devices tab.
  • In the top right corner, you should see a tab labeled “Add Device”.
  • Click the Onboard Device option.
  • Enter the IP address or hostname and click Register.
  • Once the device is registered successfully, it appears under the Provisioning tab in the undefined container.

What goes on during the onboarding process?

1. Once the IP address is added and Register is clicked, CVP opens a TLS session using port 443. This can be confirmed with a packet capture.

2. On the CVP server, the onboarding process can be followed in /cvpi/apps/cvp/logs/inventory.stderr.log

For example, the log would contain:
I0914 18:26:19.209471       8 inventory.go:192] Onboarding device: 10.81.45.33()
I0914 18:26:20.353096       8 inventory_helper.go:701] Successfully mapped 74:83:ef:c9:20:89 to 10.81.45.33

3. CVP logs in to the switch using the same username/password used to log in to the CVP GUI and checks through which VRF CVP is reachable from the switch. This is done by running routing-context vrf <vrf_name> to change the context to that VRF. The VRF is then added to the TerminAttr configuration (-ingestvrf=<vrf_name>):

daemon TerminAttr
   exec /usr/bin/TerminAttr -ingestgrpcurl=10.81.45.117:9910 -cvcompression=gzip -ingestauth=key, -smashexcludes=ale,flexCounter,hardware,kni,pulse,strata -ingestexclude=/Sysdb/cell/1/agent,/Sysdb/cell/2/agent -ingestvrf=default -taillogs
   no shutdown

4. CVP also checks whether the current TerminAttr version is >= 1.6.1, which is the minimum supported version for CVP 2019.1.x and 2020.1.x. If the switch is running a version lower than 1.6.1, CVP installs TerminAttr 1.6.1 to continue with the registration process.

5. CVP then runs a telnet from the switch to the CVP IP on port 9910 to check that the port is open and that a connection is possible.

6. Once the telnet succeeds, CVP configures and starts TerminAttr based on a hardcoded YAML file found at /cvpi/apps/cvp/conf/onboardconfig.yaml, and a TCP session is established on port 9910.

7. On the CVP server, the logs below show that the device is actively streaming and is being added to provisioning. CVP then checks that the undefined container exists and moves the device to it.

I0914 18:27:06.800079       8 inventory.go:736] Device SSJ18272718 is actively streaming
I0914 18:27:06.800122       8 inventory_helper.go:1267] Waiting for device SSJ18272718 to have valid macAddress
I0914 18:27:09.297862       8 requestHandler.go:109] Adding device SSJ18272718 to provisioning
I0914 18:33:08.385332       8 processor.go:294] Response &{10.81.45.33_onboardDevice 2020-09-14 18:32:59.226 +0000 UTC 0001-01-01 00:00:00 +0000 UTC processing Device is now streaming, adding it to provisioning  0 SSJ18272718  0} successfully written
I0914 18:33:08.396121       8 processor.go:294] Response &{mapToContainer_SSJ18272718 2020-09-14 18:33:08.384595539 +0000 UTC 0001-01-01 00:00:00 +0000 UTC processing   0   0} successfully written
I0914 18:33:08.396204       8 inventory.go:391] Mapping device SSJ18272718 to container undefined_container
I0914 18:33:08.396219       8 inventory.go:400] Verifying existence of device with key SSJ18272718
I0914 18:33:08.398508       8 inventory.go:406] Existence of device with key SSJ18272718 successfully verified
I0914 18:33:08.398529       8 inventory.go:423] Verifying existence of container with key undefined_container
I0914 18:33:08.401013       8 inventory.go:430] Existence of container with key undefined_container successfully verified
I0914 18:33:08.411624       8 inventory_helper.go:571] Upserting provisioned device &{SSJ18272718 74:83:ef:c9:20:89 undefined_container Registered   false false 10.81.45.33} to the datastore
I0914 18:33:08.422774       8 inventory.go:489] Device SSJ18272718 successfully mapped to container undefined_container
I0914 18:33:08.428900       8 processor.go:294] Response &{mapToContainer_SSJ18272718 2020-09-14 18:33:08.384595539 +0000 UTC 0001-01-01 00:00:00 +0000 UTC success   0   0} successfully written
I0914 18:33:08.429252       8 inventory_helper.go:929] Mapped device SSJ18272718 to container undefined_container

8. Once this step is completed, the device should show as successfully registered.

9. In the switch running-config, you should see daemon TerminAttr created automatically, as shown below:

daemon TerminAttr
   exec /usr/bin/TerminAttr -ingestgrpcurl=10.81.45.117:9910 -cvcompression=gzip -ingestauth=key, -smashexcludes=ale,flexCounter,hardware,kni,pulse,strata -ingestexclude=/Sysdb/cell/1/agent,/Sysdb/cell/2/agent -ingestvrf=default -taillogs
   no shutdown

10. The device can then be seen streaming data to port 9910 in a packet capture.
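
The inventory log lines quoted in steps 2 and 7 follow a glog-style layout: severity letter and date, time, thread ID, source file and line, then the message. When tracing one device through a long log, it can help to split those fields out. A minimal sketch. parse_log_line and events_for are hypothetical helper names, and the field interpretation is inferred from the sample lines above rather than from CVP documentation.

```python
import re

LOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<source>[^:\s]+):(?P<line>\d+)\] ?(?P<message>.*)$"
)


def parse_log_line(text):
    """Split one log line into its fields, or return None if it does not match."""
    match = LOG_LINE.match(text.strip())
    if match is None:
        return None
    record = match.groupdict()
    for key in ("month", "day", "thread", "line"):
        record[key] = int(record[key])
    return record


def events_for(lines, device_key):
    """Return the parsed records whose message mentions device_key."""
    records = []
    for text in lines:
        record = parse_log_line(text)
        if record is not None and device_key in record["message"]:
            records.append(record)
    return records
```

Note that the first onboarding messages identify the switch by IP address (10.81.45.33 in the example) and later ones by serial number (SSJ18272718), so both keys may be needed to follow a single registration.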

In some cases, the management IP is not used for registration and a non-physical interface such as an SVI or a loopback interface is used instead. If CVP is not directly reachable from the switch, the telnet check might fail because no source interface is specified when performing the telnet to CVP.

In this case, the telnet source interface can be specified in EOS using the command below:

ip telnet client source-interface Loopback0 vrf RED

The -cvsourceip=<source IP address> flag must also be added to the daemon TerminAttr configuration, as shown below:

daemon TerminAttr
   exec /usr/bin/TerminAttr -ingestgrpcurl=10.81.45.117:9910 -cvcompression=gzip -ingestauth=key, -smashexcludes=ale,flexCounter,hardware,kni,pulse,strata -ingestexclude=/Sysdb/cell/1/agent,/Sysdb/cell/2/agent -ingestvrf=default -taillogs -cvsourceip=10.81.45.33
   no shutdown
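
When checking a switch after onboarding, it can be handy to confirm which flags actually ended up on the exec line, for example -ingestvrf and -cvsourceip. A minimal sketch that splits the line into a dictionary. parse_terminattr_flags is a hypothetical helper, not part of EOS or CVP.

```python
def parse_terminattr_flags(exec_line):
    """Split a TerminAttr 'exec' line into a {flag: value} dict.

    Flags without '=' (such as -taillogs) map to True.
    """
    flags = {}
    for token in exec_line.split():
        if not token.startswith("-"):
            continue  # skip 'exec' and the binary path
        name, sep, value = token[1:].partition("=")
        flags[name] = value if sep else True
    return flags
```

With the configuration above, the result would contain ingestvrf set to default and cvsourceip set to 10.81.45.33. A missing cvsourceip key indicates that the source IP flag was not applied.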

Or

Instead of adding the flag manually, it can be added to the default configuration used during registration:
1. Modify the /cvpi/apps/cvp/conf/onboardconfig.yaml file and set the specifySourceIP variable to True, as below:

terminAttrBinaryPath: /usr/bin/TerminAttr 
ingestExclude: 
    - /Sysdb/cell/1/agent 
    - /Sysdb/cell/2/agent 
smashExcludes: 
    - ale 
    - flexCounter 
    - hardware 
    - kni 
    - pulse 
    - strata 
cvCompression: gzip 
tailLogs: True 
specifySourceIP: True 
hostSpecificConfig: 
    sampleHost: 
        cvVRF: sampleVRF 
ipFix: True 
ipFixAddr: 127.0.0.1:4739 
ipFixAddrDomain: default 
sFlow: True 
sFlowAddr: 127.0.0.1:6343 
sFlowAddrDomain: default 

2. Restart the inventory component by running the following:

cvpi stop inventory && cvpi start all
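
To see how the YAML settings relate to the TerminAttr exec line shown earlier, the sketch below rebuilds that line from a dictionary mirroring the file. The key-to-flag correspondence is inferred by comparing the two listings in this article. It illustrates the relationship and is not CVP's actual implementation. The sFlow and ipFix keys are left out because they do not appear on the example exec line.

```python
# Subset of the onboardconfig.yaml values listed above.
ONBOARD_CFG = {
    "terminAttrBinaryPath": "/usr/bin/TerminAttr",
    "ingestExclude": ["/Sysdb/cell/1/agent", "/Sysdb/cell/2/agent"],
    "smashExcludes": ["ale", "flexCounter", "hardware", "kni", "pulse", "strata"],
    "cvCompression": "gzip",
    "tailLogs": True,
    "specifySourceIP": True,
}


def render_terminattr_exec(cfg, cvp_addr, vrf, device_ip):
    """Build a TerminAttr exec line from onboardconfig-style settings."""
    parts = [
        cfg["terminAttrBinaryPath"],
        f"-ingestgrpcurl={cvp_addr}",
        f"-cvcompression={cfg['cvCompression']}",
        "-ingestauth=key,",
        "-smashexcludes=" + ",".join(cfg["smashExcludes"]),
        "-ingestexclude=" + ",".join(cfg["ingestExclude"]),
        f"-ingestvrf={vrf}",
    ]
    if cfg.get("tailLogs"):
        parts.append("-taillogs")
    if cfg.get("specifySourceIP"):
        # Inferred: the source IP is the address the device was onboarded with.
        parts.append(f"-cvsourceip={device_ip}")
    return "exec " + " ".join(parts)
```

Calling render_terminattr_exec(ONBOARD_CFG, "10.81.45.117:9910", "default", "10.81.45.33") reproduces the exec line with -cvsourceip shown earlier. With specifySourceIP set to False, the flag is omitted.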

Troubleshooting onboarding/registration failures

1. Switch unreachable via eAPI 

a) This means that eAPI on the switch is turned off or is not working. You can check this by running “show management api http-commands”, which shows whether it is running or not, as shown below:

Arista(config-mgmt-api-http-cmds)#show management api http-commands 
Enabled:            No
HTTPS server:       enabled, set to use port 443
HTTP server:        shutdown, set to use port 80

b) This can be enabled by issuing a no shutdown command under management api http-commands

Arista(config-mgmt-api-http-cmds)#no shutdown

c) If it still doesn't work after enabling eAPI on the switch, try opening the URL https://10.81.45.33:443 in a browser, as described earlier in this document.

d) Check that the management interface is in the correct VRF and has connectivity by issuing a ping from the switch to the CVP server:

Arista(config-mgmt-api-http-cmds)#ping 10.81.45.117
PING 10.81.45.117 (10.81.45.117) 72(100) bytes of data.
80 bytes from 10.81.45.117: icmp_seq=1 ttl=64 time=0.358 ms
80 bytes from 10.81.45.117: icmp_seq=2 ttl=64 time=0.110 ms
80 bytes from 10.81.45.117: icmp_seq=3 ttl=64 time=0.086 ms
80 bytes from 10.81.45.117: icmp_seq=4 ttl=64 time=0.073 ms
80 bytes from 10.81.45.117: icmp_seq=5 ttl=64 time=0.095 ms

e) Check that telnet to port 9910 is not being blocked by a firewall:

Arista(config)#telnet 10.81.45.117 9910
Trying 10.81.45.117...
Connected to 10.81.45.117.
Escape character is 'off'.

f) Check for any ACLs applied on the switch under the management api http-commands configuration.

g) Once all of the above are working as expected, the issue should be resolved.
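
The port checks in this list can also be run from a management host. A minimal sketch using Python's standard socket module, with tcp_port_open as a hypothetical helper name. It tests reachability only from the machine where it runs, so it complements the telnet from the switch to CVP on port 9910 and does not replace it.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, tcp_port_open("10.81.45.33", 443) checks the eAPI HTTPS port on the switch, and tcp_port_open("10.81.45.117", 9910) checks the CVP ingest port, using the example addresses from this article.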

2. Unauthorized user

a) CVP logs in to the switch using the same username/password used to log in to the CVP GUI. This error therefore means that the username is not configured locally on the switch or, if TACACS or RADIUS is used to authenticate switch logins, that the username is not present on the TACACS/RADIUS server.

b) Check whether an enable password and aaa authorization are configured. If not, there is a high chance that the user is authenticated into enable mode only.

c) Adding the username locally or on the TACACS/RADIUS server with the correct role and privilege level should help resolve the issue.

On EOS:

(config)#username cvpadmin role network-admin privilege 15 secret arista

d) This can be verified using the command “show management api http-commands” as shown below (in this case cvpadmin was used to log in to the GUI):

CSP Frame Ancestor: None
TLS Protocols:      1.0 1.1 1.2
User           Requests       Bytes in       Bytes out       Last hit
cvpadmin       21             5487           6693            9906 seconds ago

 

3. EOF

a) First, check whether any ACLs are applied under “management api http-commands” in EOS. This can be done by running “show run sec management api http-commands”:

management api http-commands
   no shutdown
   !
   vrf default
     ip access-group test

In this case, an access-list that denies all IP traffic is applied to the default VRF, resulting in this issue.

Arista(config-mgmt-api-http-cmds)#show ip access-lists test
IP Access List test
        10 deny ip any any

4. No route to host

a) This error will be observed when the switch cannot reach the CVP Server or vice-versa.

b) Check the route on both the devices.

c) Adding a route on the respective devices should help resolve the issue.

5. Unable to reach CVP from the device in any VRF

a) This error occurs when CVP is unreachable via the outgoing interface.

b) Check if telnet is successful. If not, try changing the routing context using the command:

 cli vrf <management VRF name>

c) If it fails, it probably means that telnet is not being sourced from the correct interface. The below command can be used to test:

 telnet 10.83.13.33 9910 /source-interface loopback0

d) If this resolves the issue, configure the following to prevent it from occurring again:

 ip telnet client source-interface Loopback0 vrf management

6. “Error received from device” and “Timed out waiting for response from device”

a) This means the switch is non-existent/unreachable.

b) Check for ping connectivity from the switch.

c) Check for telnet connectivity from the switch to CVP.

d) Check for ACLs applied on the interface towards the CVP Server.

 
