Streaming EOS telemetry states to InfluxDB

 
 

Introduction

The aim of this document is to help you deploy and configure InfluxDB, Grafana, and Arista EOS so that you can send telemetry states from an Arista switch to InfluxDB using octsdb, one of our OpenConfig connector applications, which you can find on our GitHub page. Please note that these apps were written as a proof of concept and are supported on a best-effort basis. You can fork the project and edit it based on your requirements. Feedback is always welcome, and issues can be filed as for any other project on GitHub.

Both OpenTSDB and InfluxDB are time-series databases. OpenTSDB is a scalable, distributed time-series database written in Java and built on top of HBase.
InfluxDB is an open-source time-series database written in Go.

octsdb communicates with TerminAttr and sends telemetry data to third-party applications such as OpenTSDB and InfluxDB (which also supports OpenTSDB's protocol and is the primary focus of this article).

Prerequisite

The following tools are required to proceed with this setup including cloning the repository and compiling octsdb for EOS.

Installing InfluxDB and Grafana

There are different ways to install InfluxDB and Grafana. For this article we will use a quick and easy Docker container installation.

Copy the following Docker Compose file to a system that has the Docker daemon running and save it as docker-compose.yaml:

version: "3"

services:
 influxdb:
   container_name: influxdb
   environment:
     INFLUXDB_DB: grpc
     INFLUXDB_ADMIN_USER: "admin"
     INFLUXDB_ADMIN_PASSWORD: "arista"
     INFLUXDB_USER: tac
     INFLUXDB_USER_PASSWORD: arista
     INFLUXDB_RETENTION_ENABLED: "false"
     INFLUXDB_OPENTSDB_0_ENABLED: "true"
     INFLUXDB_OPENTSDB_BIND_ADDRESS: ":4242"
     INFLUXDB_OPENTSDB_DATABASE: "grpc"
   ports:
     - '8086:8086'
     - '4242:4242'
     - '8083:8083'
   networks:
     - monitoring
   volumes:
     - influxdb_data:/var/lib/influxdb
   command:
     - '-config'
     - '/etc/influxdb/influxdb.conf'
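   # Pinned to 1.8: the INFLUXDB_* variables and the OpenTSDB listener used here are InfluxDB 1.x features
   image: influxdb:1.8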
   restart: always

 grafana:
   container_name: grafana
   environment:
     GF_SECURITY_ADMIN_USER: admin
     GF_SECURITY_ADMIN_PASSWORD: arista
   ports:
     - '3000:3000'
   networks:
     - monitoring
   volumes:
     - grafana_data:/var/lib/grafana
   image: grafana/grafana
   restart: always

networks:
 monitoring:

volumes:
 influxdb_data: {}
 grafana_data: {}

Validate the Compose file:

$ docker-compose config

Start the containers

$ docker-compose up -d

Verify the containers are up and running

$ docker ps --format "{{.ID}} | {{.Names}} | {{.Status}} | {{.Ports}}"
780c1fd17f7e | grafana | Up 32 minutes | 0.0.0.0:3000->3000/tcp
4b09561e57b6 | influxdb | Up 32 minutes | 0.0.0.0:4242->4242/tcp, 0.0.0.0:8083->8083/tcp, 0.0.0.0:8086->8086/tcp
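
Beyond the container status, it can help to confirm that the published ports actually accept connections. The following is a minimal sketch, assuming the containers run on the same machine as the script (the `localhost` target and the `port_open` helper are illustrative, not part of octsdb or InfluxDB):

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Ports published by the Compose file; "localhost" is an assumption.
SERVICES = {"InfluxDB HTTP API": 8086, "OpenTSDB listener": 4242, "Grafana": 3000}

if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "reachable" if port_open("localhost", port) else "NOT reachable"
        print(f"{name} (tcp/{port}): {state}")
```

The same helper can be pointed at `<server-IP>` from another host to check that no firewall sits between the switch and port 4242.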

To stop the containers and delete the docker volumes you can use

$ docker-compose down -v

 

If not using docker installation

The Compose file automatically takes care of enabling the OpenTSDB listener in InfluxDB. If you are not using the Docker installation, you can enable the listener with the steps below.

Edit /etc/influxdb/influxdb.conf to include the following:

[[opentsdb]]
  enabled = true
  bind-address = ":4242"
  database = "grpc"

Restart the InfluxDB service

$ systemctl restart influxdb.service

Set up the database from the influx shell:

  • create database grpc
  • create user <DBUser> with password '<DBPassword>'

Installing and Configuring octsdb for EOS

Pull the repository from GitHub (or you can use git clone)

$ go get github.com/aristanetworks/goarista/cmd/octsdb

Go to the octsdb directory (or the directory to which you have cloned the repo using git clone)

$ cd $GOPATH/src/github.com/aristanetworks/goarista/cmd/octsdb

Compile the package for EOS

$ GOOS=linux GOARCH=386 go build

NOTE: For EOS with x86_64 architecture, compile the package as follows:

$ GOOS=linux GOARCH=amd64 go build

Copy the binary to the /mnt/flash/ directory on the switch

$ scp $GOPATH/src/github.com/aristanetworks/goarista/cmd/octsdb/octsdb admin@<switch-MGMT-IP>:/mnt/flash/

Octsdb configuration file

Octsdb requires a JSON configuration file that lists the paths it should subscribe to. The following is a very simple configuration file (most paths have been removed for brevity):

{
   "comment": "TerminAttr Parser to OpenTSDB",
   "subscriptions": [
       "/Kernel/proc/cpu/utilization"
   ],
   "metricPrefix": "eos",
   "metrics": {
       "totalCpu": {
           "path": "/Kernel/proc/(cpu)/(utilization)/(total)/(?P<type>.+)"
       },
       "coreCpu": {
           "path": "/Kernel/proc/(cpu)/(utilization)/(.+)/(?P<type>.+)"
       }
   }
}
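
To make the role of the capture groups concrete, here is a minimal sketch of how a metric path regex could turn a telemetry path into a measurement name and tags. The naming rule (prefix, lowercased metric key, then unnamed groups joined by dots, with named groups becoming tags) is an assumption inferred from the measurement names shown later in this article; it is not octsdb's actual code, and `match_metric` is a hypothetical helper.

```python
import re


def match_metric(prefix, name, pattern, path):
    """Map a telemetry path to (metric_name, tags), or None if it does not match.

    Assumed rule: prefix + lowercased metric key + unnamed groups, joined
    with dots; named groups (?P<tag>...) become tags.
    """
    m = re.fullmatch(pattern, path)
    if m is None:
        return None
    named = set(m.re.groupindex.values())  # positions of the named groups
    unnamed = [m.group(i) for i in range(1, m.re.groups + 1) if i not in named]
    # A group such as "cpu/0" spans two path elements; flatten it to "cpu.0".
    parts = [prefix, name.lower()] + [g.replace("/", ".") for g in unnamed]
    return ".".join(parts), m.groupdict()


CORE_CPU = r"/Kernel/proc/(cpu)/(utilization)/(.+)/(?P<type>.+)"

if __name__ == "__main__":
    print(match_metric("eos", "coreCpu", CORE_CPU,
                       "/Kernel/proc/cpu/utilization/cpu/0/idle"))
    # ('eos.corecpu.cpu.utilization.cpu.0', {'type': 'idle'})
```

The greedy `(.+)` group absorbs everything up to the last path element, which is why per-core paths such as `cpu/0` and the aggregate `total` path can both be matched by the coreCpu pattern.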

For details about subscription paths and metric path structures please visit: https://eos.arista.com/understanding-subscription-paths-for-open-source-telemetry-streaming

Sample configuration files from the official GitHub page:

  1. Below EOS 4.20
  2. Above EOS 4.20

Configuring TerminAttr and octsdb daemon

Default VRF with CVP

!
daemon TerminAttr
   exec /usr/bin/TerminAttr -ingestgrpcurl=<CVP-IP>:9910 -cvcompression=gzip -taillogs -ingestauth=key,magickey -smashexcludes=ale,flexCounter,hardware,kni,pulse,strata -ingestexclude=/Sysdb/cell/1/agent,/Sysdb/cell/2/agent -disableaaa -grpcaddr 0.0.0.0:6042
   no shutdown
!
daemon octsdb
   exec /mnt/flash/octsdb -addr <switch-mgmt-ip>:6042 -config /mnt/flash/sampleConfig.json -tsdb 10.85.129.115:4242
   no shutdown
!

Default VRF without CVP

!
daemon TerminAttr
  exec /usr/bin/TerminAttr -disableaaa -grpcaddr 0.0.0.0:6042
  no shutdown
!
daemon octsdb
   exec /mnt/flash/octsdb -addr <switch-mgmt-ip>:6042 -config /mnt/flash/sampleConfig.json -tsdb 10.85.129.115:4242
   no shutdown
!

VRF management with CVP

!
daemon TerminAttr
  exec /usr/bin/TerminAttr -ingestgrpcurl=<CVP-IP>:9910 -cvcompression=gzip -taillogs -ingestvrf=management -ingestauth=key,magickey -smashexcludes=ale,flexCounter,hardware,kni,pulse,strata -ingestexclude=/Sysdb/cell/1/agent,/Sysdb/cell/2/agent -disableaaa -grpcaddr management/0.0.0.0:6042
  no shutdown
!
daemon octsdb
  exec /sbin/ip netns exec ns-management /mnt/flash/octsdb -addr <switch-mgmt-ip>:6042 -config /mnt/flash/sampleConfig.json -tsdb 10.85.129.115:4242
  no shutdown
!

VRF management without CVP

!
daemon TerminAttr
  exec /usr/bin/TerminAttr -disableaaa -grpcaddr management/0.0.0.0:6042
  no shutdown
!
daemon octsdb
  exec /sbin/ip netns exec ns-management /mnt/flash/octsdb -addr <switch-mgmt-ip>:6042 -config /mnt/flash/sampleConfig.json -tsdb 10.85.129.115:4242
  no shutdown
!

VRF management without CVP, with authentication

!
daemon TerminAttr
  exec /usr/bin/TerminAttr -grpcaddr management/0.0.0.0:6042
  no shutdown
!
daemon octsdb
  exec /sbin/ip netns exec ns-management /mnt/flash/octsdb -addr <switch-mgmt-ip>:6042 -config /mnt/flash/sampleConfig.json -username=cvpadmin -password=arista -tsdb 10.85.129.115:4242
  no shutdown
!

In the above examples, the -tsdb flag specifies the address of the OpenTSDB/InfluxDB server to which telemetry is pushed.
NOTE: For EOS x86_64 architecture, use the x86_64 TerminAttr swix file, available on our website.
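
The octsdb exec lines above differ only in two respects: whether the binary is wrapped in `ip netns exec ns-<vrf>`, and whether credentials are passed. The sketch below captures that pattern; `octsdb_exec` is a hypothetical helper for generating the line, and it only uses flags that appear in the examples above.

```python
def octsdb_exec(switch_addr, tsdb_addr, config="/mnt/flash/sampleConfig.json",
                vrf=None, username=None, password=None):
    """Build the 'exec ...' line for the octsdb daemon stanza."""
    args = []
    if vrf:
        # A non-default VRF maps to the Linux network namespace ns-<vrf>.
        args += ["/sbin/ip", "netns", "exec", f"ns-{vrf}"]
    args += ["/mnt/flash/octsdb", "-addr", switch_addr, "-config", config]
    if username is not None:
        # Only needed when TerminAttr runs without -disableaaa.
        args += [f"-username={username}", f"-password={password}"]
    args += ["-tsdb", tsdb_addr]
    return "exec " + " ".join(args)


if __name__ == "__main__":
    print(octsdb_exec("<switch-mgmt-ip>:6042", "10.85.129.115:4242",
                      vrf="management", username="cvpadmin", password="arista"))
```

With `vrf=None` and no credentials, the function reproduces the default-VRF exec line.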

Flags for TerminAttr

disableaaa: Disables AAA checking, so all AAA requests pass. Use it when octsdb is configured without a username and password.

grpcaddr string: VRF and address to listen on to serve data using the gNMI interface. The expected form is [<vrf-name>/]address:port (default “127.0.0.1:6042”)

Flags for octsdb

addr: Address of the gRPC server (TerminAttr) to connect to
config: Path to the JSON config file
tsdb: Address of the OpenTSDB/InfluxDB server.
username: Username to authenticate with (if using authentication in TerminAttr)
password: Password to authenticate with (if using authentication in TerminAttr)

Verifying the Telemetry data in InfluxDB

Connect to the InfluxDB database using the command below (this is for the Docker deployment; for other deployments, skip the docker exec command and run influx directly):

$ docker exec -it influxdb bash
root@4b09561e57b6:/# influx -precision 'rfc3339'
Connected to http://localhost:8086 version 1.8.0
InfluxDB shell version: 1.8.0
>
> show databases
name: databases
name
----
grpc
_internal
>
> use grpc
Using database grpc
>
> show measurements
name: measurements
name
----
eos.corecpu.cpu.utilization._counts
eos.corecpu.cpu.utilization.cpu.0
eos.corecpu.cpu.utilization.cpu.1
eos.corecpu.cpu.utilization.cpu.2
eos.corecpu.cpu.utilization.cpu.3
eos.corecpu.cpu.utilization.total
>
> SELECT * FROM "eos.corecpu.cpu.utilization.cpu.0" ORDER BY DESC LIMIT 5
name: eos.corecpu.cpu.utilization.cpu.0
time                 host          type   value
----                 ----          ----   -----
2020-04-29T07:00:49Z 10.85.128.117 idle   98689474
2020-04-29T07:00:49Z 10.85.128.117 nice   1863
2020-04-29T07:00:49Z 10.85.128.117 system 891768
2020-04-29T07:00:49Z 10.85.128.117 user   4889342
2020-04-29T07:00:49Z 10.85.128.117 util   6
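
The same check can be scripted against the InfluxDB 1.x HTTP API instead of the interactive shell. The following is a sketch using only the Python standard library; the base URL and credentials are assumptions taken from the Compose file in this article, and `query_url`, `rows`, and `run_query` are illustrative helper names.

```python
import json
import urllib.parse
import urllib.request


def query_url(base_url, db, query, user=None, password=None):
    """Build a URL for the InfluxDB 1.x /query endpoint."""
    params = {"db": db, "q": query}
    if user is not None:
        params.update({"u": user, "p": password})
    return f"{base_url.rstrip('/')}/query?{urllib.parse.urlencode(params)}"


def rows(payload):
    """Flatten a /query JSON response into a list of dicts, one per row."""
    out = []
    for result in payload.get("results", []):
        for series in result.get("series", []):
            for values in series["values"]:
                out.append(dict(zip(series["columns"], values)))
    return out


def run_query(base_url, db, query, user=None, password=None):
    """Run the query and return its rows (requires a reachable InfluxDB)."""
    url = query_url(base_url, db, query, user, password)
    with urllib.request.urlopen(url, timeout=5) as resp:
        return rows(json.load(resp))


# Example call (assumed host and credentials):
# run_query("http://localhost:8086", "grpc",
#           'SELECT * FROM "eos.corecpu.cpu.utilization.cpu.0" '
#           'ORDER BY time DESC LIMIT 5', "admin", "arista")
```

Passing credentials as URL parameters keeps the sketch short; for anything beyond a lab setup, HTTP basic authentication over TLS is the safer choice.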

Configuring Grafana

First, we need to add InfluxDB as a data source. To do this, go to Configuration (gear icon) and then to Data Sources.

Click on Add Data Source button

Select InfluxDB

Fill out the form. The required fields are:
URL: http://influxdb:8086 OR http://<server-IP>:8086
Database: grpc
User: admin (username specified during InfluxDB setup)
Password: arista (password specified during InfluxDB setup)
It should look something like below

Click on Save and Test
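
If you prefer to script this step, the data source can also be created through Grafana's HTTP API. The sketch below is an assumption-laden illustration: it posts to `/api/datasources` with basic authentication and uses the field names from Grafana's data source provisioning format as I understand it (newer Grafana releases may expect the database name under `jsonData.dbName` instead of `database`). The helper names are hypothetical.

```python
import base64
import json
import urllib.request


def datasource_payload(url, database, user, password, name="InfluxDB"):
    """Build the JSON body describing an InfluxDB (InfluxQL) data source."""
    return {
        "name": name,
        "type": "influxdb",
        "access": "proxy",  # Grafana's backend talks to InfluxDB
        "url": url,
        "database": database,
        "user": user,
        "secureJsonData": {"password": password},
    }


def create_datasource(grafana_url, admin_user, admin_password, payload):
    """POST the payload to Grafana (requires a reachable Grafana instance)."""
    token = base64.b64encode(f"{admin_user}:{admin_password}".encode()).decode()
    req = urllib.request.Request(
        f"{grafana_url.rstrip('/')}/api/datasources",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Basic {token}"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)


# Example call (values assumed from the Compose file in this article):
# create_datasource("http://localhost:3000", "admin", "arista",
#                   datasource_payload("http://influxdb:8086", "grpc", "admin", "arista"))
```

The URL `http://influxdb:8086` works here because both containers share the monitoring network, so Grafana can resolve the InfluxDB container by name.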

Creating Dashboards

Click on the Add (+) Icon and Create:

You will see a New Panel; you can also add one by clicking the Add panel icon.

Click on Add Query and then make sure you have “InfluxDB” as your data source

Click on select measurement and choose the measurement:

You can build the Query rules, for example:

And add multiple visualizations with different measurements in your dashboard:

Troubleshooting

The daemon/agent logs are stored in /var/log/agents directory on the Arista switch.
From the switch CLI you can check the logs using the below command:

# show agent octsdb logs

 

Or, from the bash shell, you can use cat/more/less/vi/nano/tail

# bash cat /var/log/agents/<agent-log-file>

 

The Authentication failed message indicates that octsdb cannot authenticate to the gRPC server that TerminAttr is serving, either because the disableaaa flag is not specified in the TerminAttr config or because the username and password strings in the octsdb config are incorrect.

===> /var/log/agents/octsdb-13194 Wed Apr 29 13:08:02 2020 <===
===== Output from /mnt/flash/octsdb ['-addr', '10.85.128.117:6042', '-config', '/mnt/flash/sampleConfig.json', '-tsdb', '10.85.129.115:4242'] (PID=13194) started Apr 29 13:08:02.179891 ===
F0429 13:08:02.369270   13194 main.go:114] rpc error: code = Unauthenticated desc = Authentication failed
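
The log lines written by octsdb follow the glog layout: a severity letter (I, W, E, or F), the date as MMDD, a timestamp, the process ID, the source file and line, and then the message. When scanning long logs, a small parser can help pick out the fatal entries. This is a sketch with a hypothetical `parse_glog` helper, based only on the log lines shown in this article:

```python
import re

# Severity letter, MMDD, time, PID, file:line, then the message.
GLOG_LINE = re.compile(
    r"^(?P<sev>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<pid>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)$"
)
SEVERITY = {"I": "INFO", "W": "WARNING", "E": "ERROR", "F": "FATAL"}


def parse_glog(line):
    """Return a dict describing a glog-formatted line, or None for other lines."""
    m = GLOG_LINE.match(line.rstrip("\n"))
    if m is None:
        return None  # e.g. the "===>" banner lines written by the agent wrapper
    entry = m.groupdict()
    entry["severity"] = SEVERITY[entry.pop("sev")]
    entry["pid"] = int(entry["pid"])
    entry["line"] = int(entry["line"])
    return entry


if __name__ == "__main__":
    sample = ("F0429 13:08:02.369270   13194 main.go:114] rpc error: "
              "code = Unauthenticated desc = Authentication failed")
    print(parse_glog(sample))
```

An F entry means octsdb exited, whereas I entries such as "Failed to put datapoint" are logged while the agent keeps running.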

 

The connection refused error means that the gRPC server is not reachable. In this case, octsdb is executed in the management VRF while TerminAttr is running in the default VRF, where its gRPC server listens by default. To fix it, configure the gRPC server in the correct VRF, for example with -grpcaddr management/0.0.0.0:6042.

===> /var/log/agents/octsdb-13622 Wed Apr 29 13:11:49 2020 <===
===== Output from /sbin/ip ['netns', 'exec', 'ns-management', '/mnt/flash/octsdb', '-addr', '10.85.128.117:6042', '-config', '/mnt/flash/sampleConfig.json', '-tsdb', '10.85.129.115:4242'] (PID=13622) started Apr 29 13:11:49.322222 ===
F0429 13:11:49.363874   13622 main.go:114] rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:6042: connect: connection refused"

 

The following error is seen when octsdb cannot connect to the OpenTSDB endpoint. On the remote server, check that InfluxDB is running, that the OpenTSDB listener is enabled, and that the firewall allows port 4242.

===> /var/log/agents/octsdb-14445 Wed Apr 29 13:20:26 2020 <===
===== Output from /mnt/flash/octsdb ['-addr', '10.85.128.117:6042', '-config', '/mnt/flash/sampleConfig.json', '-tsdb', '10.85.129.115:4242'] (PID=14445) started Apr 29 13:19:45.372118 ===
I0429 13:20:15.453509   14445 main.go:210] Element 7: map[value:0] is map[string]interface {}, not json.Number
I0429 13:20:15.453966   14445 main.go:161] Failed to put datapoint: dial tcp 10.85.129.115:4242: connect: connection refused
I0429 13:20:15.454252   14445 main.go:161] Failed to put datapoint: dial tcp 10.85.129.115:4242: connect: connection refused
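
To test the OpenTSDB endpoint independently of the switch, you can push a single datapoint by hand using OpenTSDB's telnet-style `put` command (`put <metric> <timestamp> <value> <tag=value ...>`). The sketch below assumes the InfluxDB listener accepts this text format on port 4242; the metric name `eos.test.metric` and the helper names are made up for the test.

```python
import socket
import time


def put_line(metric, value, tags, timestamp=None):
    """Format one OpenTSDB telnet-style 'put' command (at least one tag is required)."""
    if not tags:
        raise ValueError("OpenTSDB requires at least one tag")
    ts = int(time.time()) if timestamp is None else int(timestamp)
    tag_str = " ".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"put {metric} {ts} {value} {tag_str}\n"


def send_line(host, port, line, timeout=5.0):
    """Open a TCP connection to the listener and send the line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))


# Example call (server address taken from the examples in this article):
# send_line("10.85.129.115", 4242,
#           put_line("eos.test.metric", 1, {"host": "manual-test"}))
```

If the call raises ConnectionRefusedError, the problem lies on the server side (listener disabled, wrong port, or firewall) rather than with octsdb. If it succeeds, the test measurement should appear under show measurements in the grpc database.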

 

You can also increase octsdb agent logging verbosity by using the -v flag:

daemon octsdb
   exec /mnt/flash/octsdb -addr 10.85.128.117:6042 -config /mnt/flash/sampleConfig.json -tsdb 10.85.129.115:4242 -v 9

Useful links

https://grafana.com/docs/

https://docs.influxdata.com/influxdb/v1.8/

Example Configuration files

https://github.com/aristanetworks/goarista/tree/master/cmd/octsdb/sample_configs
