Understanding subscription paths for Open-source Telemetry streaming

 
 

Introduction

 

The purpose of this document is to explain how subscription paths are constructed for our OpenConfig connector apps (ocprometheus, ockafka, octsdb, etc.), which communicate with TerminAttr and send telemetry data to third-party telemetry backends (Kafka, Prometheus, TSDB, Redis, Graphite, etc.).

All our OpenConfig connectors are publicly available and can be found on the goarista github repo:

https://github.com/aristanetworks/goarista/tree/master/cmd

Most of these OpenConfig connectors read the paths they should subscribe to from a YAML or JSON file.

Others, such as ockafka and ocredis, don’t support reading paths from a file, so you have to list the paths on the command line, separated by commas.
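
For the connectors that take paths on the command line, the comma-separated value can be assembled from a list. Here is a minimal Python sketch; the helper name subscribe_arg is made up for illustration, and the two paths are the ones used in the ockafka sample config at the end of this article.

```python
def subscribe_arg(paths):
    """Join subscription paths into one comma-separated string."""
    cleaned = [p.strip() for p in paths if p.strip()]
    for p in cleaned:
        # A comma inside a path would be read as a separator.
        if "," in p:
            raise ValueError(f"path contains a comma: {p!r}")
    return ",".join(cleaned)


paths = [
    "/Sysdb/environment/archer/temperature/status/system/",
    "/Kernel/proc/cpu",
]
print("-subscribe", subscribe_arg(paths))
```

Keeping the paths in a list makes it easier to review them one per line before they are flattened into a single argument.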

Checking the paths

There are a couple of ways to find paths:

  • Using TerminAttr’s REST API, exposed locally on TCP port 6060
  • Using the Telemetry Browser in CVP (easy and quick)

 

1) Using TerminAttr

TerminAttr exposes its REST API on TCP port 6060, and we can use curl to read the paths that it is streaming.

Examples:

curl localhost:6060/rest/Sysdb/

curl localhost:6060/rest/Smash/

curl localhost:6060/rest/LANZ/

curl localhost:6060/rest/Config/

curl localhost:6060/rest/NTP/
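
The same requests can be issued from a script. The sketch below uses only the Python standard library and assumes it runs on the switch itself, so that localhost:6060 is reachable, and that the response is JSON as in the outputs shown in this article. The helper names rest_url and fetch are my own, not part of TerminAttr.

```python
import json
import urllib.request

BASE = "http://localhost:6060/rest"  # TerminAttr's local REST endpoint


def rest_url(path):
    """Build the REST URL for a state path such as /Sysdb or /Smash."""
    return BASE + "/" + path.strip("/")


def fetch(path, timeout=5):
    """Return the decoded JSON document for one path (needs a running TerminAttr)."""
    with urllib.request.urlopen(rest_url(path), timeout=timeout) as resp:
        return json.load(resp)


# Only the URLs are printed here; call fetch() on a device to read the data.
for root in ("Sysdb", "Smash", "LANZ", "Config", "NTP"):
    print(rest_url(root))
```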

 

2) Using the Telemetry browser in CVP

If you click on any dataset (device) in the Telemetry browser you’ll be presented with all the paths that TerminAttr is streaming and you can explore each path and find the Sysdb/Smash/NTP, etc. states 

Metric Explorer

Starting with 2018.2.2, we’ve introduced the Metric Explorer, which can help determine the paths where the data is stored.

 

So how exactly are we using these paths? And how do we tag/label common state entities? Let’s find out!

Metric path structuring

Interface counters

 

Up until EOS 4.19.x, interface counters were found under /Sysdb/interface/counter/eth/slice/phy/<linecard>/intfCounterDir/<interface>/intfCounter/current

Starting from EOS 4.20.x, they are instead found under /Smash/counters/ethIntf/<agent>/current/counter where <agent> is the name of the platform-specific agent managing counters. Values for <agent> are:

    • 7500-family, 7280-family, 7020-family (Arad/Jericho ASICs): SandCounters
    • 7300-family, 7250-family, 7050-family, 7010 products (Trident ASICs): StrataCounters
    • For 7060-family, 7260-family (Tomahawk): Strata-FixedSystem or StrataCounters from 4.22+
    • 7150-family products (Alta ASICs): FocalPointV2
    • 7160-family products (Cavium/Xpliant ASICs): XpCounters
    • 7170-family products (Barefoot ASIC): BfnCounters
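
The list above can be turned into a small lookup when you need to generate the counter path for different platforms. This is only a sketch in Python: the function names are invented, the table is copied from the list above, and the path layout is the one described for EOS 4.20.x and later.

```python
# Agent names per product family, copied from the list above.
AGENT_BY_FAMILY = {
    "7500": "SandCounters", "7280": "SandCounters", "7020": "SandCounters",
    "7300": "StrataCounters", "7250": "StrataCounters",
    "7050": "StrataCounters", "7010": "StrataCounters",
    "7150": "FocalPointV2",
    "7160": "XpCounters",
    "7170": "BfnCounters",
}


def agent_for(family, eos_version=(4, 22)):
    """Return the counters agent for a product family and EOS (major, minor)."""
    if family in ("7060", "7260"):
        # Tomahawk: the agent name changes starting with 4.22.
        return "StrataCounters" if eos_version >= (4, 22) else "Strata-FixedSystem"
    return AGENT_BY_FAMILY[family]


def counter_path(family, eos_version=(4, 22)):
    """Build the Smash path holding the current interface counters."""
    agent = agent_for(family, eos_version)
    return f"/Smash/counters/ethIntf/{agent}/current/counter"


print(counter_path("7160"))
print(counter_path("7260", (4, 21)))
```

Treat the table as a starting point and confirm the agent name on the device itself.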

You can verify the agent name by dropping to bash and checking the Smash tables for ethIntf counters,

e.g. on a 7260 before and after 4.22:

Pre 4.22:

[admin@7260 ~]$ smash counters/ethIntf
More than one table was found. Try again using one of the following tables:
- ar/Smash/counters/ethIntf/Strata-FixedSystem/lastClear/counter
- ar/Smash/counters/ethIntf/Strata-FixedSystem/current/counter
- ar/Smash/counters/ethIntf/PhyEthtool-1/current/counter

 

Post 4.22:

[admin@gd502 ~]$ smash counters/ethIntf
More than one table was found. Try again using one of the following tables:
- ar/Smash/counters/ethIntf/StrataCounters/lastClear/counter
- ar/Smash/counters/ethIntf/StrataCounters/current/counter
- ar/Smash/counters/ethIntf/PhyEthtool-1/current/counter
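
If you want to pick the agent name out of that listing in a script, a regular expression is enough. The Python sketch below parses the post-4.22 output shown above. Note that the listing also contains PhyEthtool-1, which this article does not describe, so the sketch simply returns every name that owns a current/counter table and leaves the choice to you.

```python
import re

SMASH_OUTPUT = """\
More than one table was found. Try again using one of the following tables:
- ar/Smash/counters/ethIntf/StrataCounters/lastClear/counter
- ar/Smash/counters/ethIntf/StrataCounters/current/counter
- ar/Smash/counters/ethIntf/PhyEthtool-1/current/counter
"""

# One listing line that ends in <name>/current/counter.
TABLE = re.compile(r"^- ar/Smash/counters/ethIntf/(?P<agent>[^/]+)/current/counter$")


def agents_with_current_counters(output):
    """Return the names that own a current/counter table, in listing order."""
    names = []
    for line in output.splitlines():
        m = TABLE.match(line.strip())
        if m and m.group("agent") not in names:
            names.append(m.group("agent"))
    return names


print(agents_with_current_counters(SMASH_OUTPUT))
```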

 

Platform-specific path components might be removed in the future for interface counters in order to simplify things.

In the case of ocprometheus, the ocprometheus.yml config file contains both the subscription paths and the metric paths. A subscription path gives the root of the path tree to subscribe to, whereas a metric path is a specific subpath that contains the states/values of counters, DOM levels, etc.
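
One way to picture the relationship is to check that a metric path sits below one of the subscription roots. The Python sketch below is a rough heuristic of my own, not something ocprometheus does: it compares the two paths element by element and treats each metric element as a regular expression. It assumes that no regex in the metric path contains a "/" itself. The metric paths are the ones built later in this article; /Smash/counters/ethIntf as the subscription root for the counters is an assumption.

```python
import re

COUNTERS_METRIC = (
    r"/Smash/counters/ethIntf/XpCounters/current/(counter)/(?P<intf>.+)"
    r"/statistics/(?P<direction>(?:in|out))(Octets|Errors|Discards)"
)
DOM_METRIC = (
    r"/Sysdb/hardware/archer/(xcvr)/status/slice/(?P<linecard>.+)/(?P<intf>.+)"
    r"/domRegisterData/lane(?P<lane>\d)(OpticalRxPower)"
)


def is_under(subscription, metric_path):
    """Heuristic: True if the leading elements of metric_path match subscription."""
    sub = subscription.strip("/").split("/")
    met = metric_path.strip("/").split("/")
    if len(sub) > len(met):
        return False
    # zip() stops at the end of the (shorter) subscription path.
    return all(re.fullmatch(m, s) for m, s in zip(met, sub))


print(is_under("/Smash/counters/ethIntf", COUNTERS_METRIC))
print(is_under("/Sysdb/hardware/archer/xcvr/status", DOM_METRIC))
print(is_under("/Sysdb/hardware/archer/xcvr/status", COUNTERS_METRIC))
```

A metric path that is not under any subscription root would never receive updates, so a check like this can catch typos early.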

In my first example I’m going to expand on the interface counters and explain how you can construct your metric path. 

 

In this example I’ve used a 7160 (Xpliant) switch so the agent name is XpCounters. Let’s look at the steps:

 

1. First of all, you can use curl to follow the path tree,

e.g. curl localhost:6060/rest/Smash/counters/ethIntf lists the subpaths under /Smash/counters/ethIntf/:

 

 

curl localhost:6060/rest/Smash/counters/ethIntf

{
    "PhyEthtool-1": {
        "_ptr": "/Smash/counters/ethIntf/PhyEthtool-1"
    },
    "XpCounters": {
        "_ptr": "/Smash/counters/ethIntf/XpCounters"
    },
    "name": "ethIntf"
}

 

 

2. To go further down the path you add the next folder, which in this case is XpCounters, as in this example I’m using a 7160 device (Xpliant ASIC).

 

[admin@ats324 ~]$ curl localhost:6060/rest/Smash/counters/ethIntf/XpCounters

{
    "current": {
        "_ptr": "/Smash/counters/ethIntf/XpCounters/current"
    },
    "lastClear": {
        "_ptr": "/Smash/counters/ethIntf/XpCounters/lastClear"
    },
    "name": "XpCounters"
}
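
Each "_ptr" entry in these responses names the next level of the tree, so the walk can be automated. Here is a small Python sketch that lists the child paths of one response, using the XpCounters output above as input; child_paths is a name I made up.

```python
import json

# Response copied from the XpCounters request above.
RESPONSE = json.loads("""
{
    "current": {"_ptr": "/Smash/counters/ethIntf/XpCounters/current"},
    "lastClear": {"_ptr": "/Smash/counters/ethIntf/XpCounters/lastClear"},
    "name": "XpCounters"
}
""")


def child_paths(node):
    """Return the _ptr target of every child that points to a sub-path."""
    return sorted(
        value["_ptr"]
        for value in node.values()
        if isinstance(value, dict) and "_ptr" in value
    )


# Print the curl command for each child, ready to paste on the switch.
for path in child_paths(RESPONSE):
    print("curl localhost:6060/rest" + path)
```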

 

3. From the ocprometheus.yml you can see that the next folder is “current”. This lists the current counters for all interfaces. The output is very long, so it is truncated below.

 

curl localhost:6060/rest/Smash/counters/ethIntf/XpCounters/current

{
    "counter": {
        "Ethernet25": {
            "counterRefreshTime": 0,
            "ethStatistics": {
                "alignmentErrors": 0,
                "carrierSenseErrors": 0,
                "deferredTransmissions": 0,
                "excessiveCollisions": 0,
                "fcsErrors": 0,
                "fragments": 0,
                "frameTooLongs": 0,
                "frameTooShorts": 0,
                "in1024To1522OctetFrames": 0,
                "in128To255OctetFrames": 153962,
                "in1523ToMaxOctetFrames": 0,
                "in256To511OctetFrames": 167,
                "in512To1023OctetFrames": 0,
                "in64OctetFrames": 5556,
                "in65To127OctetFrames": 1338313,
                "inPauseFrames": 0,
                "inPfcClassFrames": {
                    "count": [
                        0,
<omitted>
                },
                "inPfcFrames": {
                    "value": 0
                },
                "inPfcRatePerMcQ": {
                    "rate": [
                        0,
<omitted>
                },
                "inPfcRatePerQ": {
                    "rate": [
                        0,
<omitted>
                },
                "inUnknownOpcodes": 0,
                "internalMacReceiveErrors": 0,
                "internalMacTransmitErrors": 0,
                "jabbers": 0,
                "lateCollisions": 0,
                "multipleCollisionFrames": 0,
                "out1024To1522OctetFrames": {
                    "value": 9223372036854775808
                },
                "out128To255OctetFrames": {
                    "value": 9223372036854775808
                },
                "out1523ToMaxOctetFrames": {
                    "value": 9223372036854775808
                },
                "out256To511OctetFrames": {
                    "value": 9223372036854775808
                },
                "out512To1023OctetFrames": {
                    "value": 9223372036854775808
                },
                "out64OctetFrames": {
                    "value": 9223372036854775808
                },
                "out65To127OctetFrames": {
                    "value": 9223372036854775808
                },
                "outPauseFrames": 0,
                "outPfcClassFrames": {
                    "count": [
                        0,
<omitted>
                },
                "outPfcFrames": {
                    "value": 0
                },
                "outPfcRatePerPrio": {
                    "count": [
                        0,
<omitted>
                },
                "singleCollisionFrames": 0,
                "sqeTestErrors": 0,
                "symbolErrors": 0
            },
            "genId": 0,
            "key": "Ethernet25",
            "linkStatusChanges": 0,
            "rates": {
                "inBitsRate": {
                    "value": 88.31070316749646
                },
                "inPktsRate": {
                    "value": 0.06659387740513918
                },
                "outBitsRate": {
                    "value": 581.8436338321253
                },
                "outPktsRate": {
                    "value": 0.5689706698311034
                },
                "statsUpdateTime": 1547750263.297956
            },
            "statistics": {
                "inBroadcastPkts": 5720,
                "inDiscards": 0,
                "inErrors": 0,
                "inMulticastPkts": 1491952,
                "inOctets": 164649455,
                "inTotalPkts": 0,
                "inUcastPkts": 326,
                "lastUpdate": 1547750263.297956,
                "outBroadcastPkts": 247256,
                "outDiscards": {
                    "value": 21892257
                },
                "outErrors": 0,
                "outMulticastPkts": 2089696,
                "outOctets": {
                    "value": 89910206170
                },
                "outUcastPkts": 65357419
            }
        },

 

4. To check a specific Ethernet interface you would have to go further down the path. However, there are no more folders after the current folder, only key-value pairs, so this is the last stop for curl. It gets a bit trickier from here, and you’ll need to use regex.

 

After this point you won’t be able to see specific values with the curl command, but I’ll try to explain how the whole path is built for the intfCounter metric:

Once again, this is our path:

/Smash/counters/ethIntf/XpCounters/current/(counter)/(?P<intf>.+)/statistics/(?P<direction>(?:in|out))(Octets|Errors|Discards)

After /Smash/counters/ethIntf/XpCounters/current/ the first thing we have to match on is the top-most dictionary key, in this case “counter“. You can see this key when you run:

 

curl localhost:6060/rest/Smash/counters/ethIntf/XpCounters/current

{
    "counter": {

 

counter is the first key element and contains all the interfaces. We put it in parentheses to make it a regex capturing group: (counter)

 

5. Next, we want the counters for each interface separately, so we have to match on any interface name. We can do that with the regex .+ (dot and plus), which matches any character one or more times, so we will match EthernetX, ManagementX, VlanX and Port-ChannelX as well.

To label these interfaces, we can give it a label name using the following expression: ?P<label_name> 

so our final name capturing regex group will be (?P<intf>.+)

Labels/tags are important on the server side, because you’ll be able to create your graphs using these labels as filters.

You can also test the expression with an online regex tool such as www.regex101.com.
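
The named group can also be tried out locally. Python’s re module accepts the same (?P<name>...) syntax, so here is a quick sketch of what the intf label would capture. Ethernet25 comes from the output above; the other two interface names are made-up examples of the name styles mentioned in step 5.

```python
import re

# Path prefix plus the named group from the steps above.
PATTERN = re.compile(
    r"/Smash/counters/ethIntf/XpCounters/current/(counter)/(?P<intf>.+)/statistics/"
)

paths = [
    "/Smash/counters/ethIntf/XpCounters/current/counter/Ethernet25/statistics/inOctets",
    "/Smash/counters/ethIntf/XpCounters/current/counter/Management1/statistics/inOctets",
    "/Smash/counters/ethIntf/XpCounters/current/counter/Port-Channel10/statistics/outErrors",
]

# match() anchors at the start of the string; group("intf") is the label value.
labels = [PATTERN.match(p).group("intf") for p in paths]
print(labels)
```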


6. Now, the next thing we want to look at is the RX/TX octets, errors and discards, which we can find under the statistics key as seen in the dictionary:

            "statistics": {
                "inBroadcastPkts": 5720,
                "inDiscards": 0,
                "inErrors": 0,
                "inMulticastPkts": 1491952,
                "inOctets": 164649455,
                "inTotalPkts": 0,
                "inUcastPkts": 326,
                "lastUpdate": 1547750263.297956,
                "outBroadcastPkts": 247256,
                "outDiscards": {
                    "value": 21892257
                },
                "outErrors": 0,
                "outMulticastPkts": 2089696,
                "outOctets": {
                    "value": 89910206170
                },
                "outUcastPkts": 65357419

 

so far our path looks like the following:

 /Smash/counters/ethIntf/XpCounters/current/(counter)/(?P<intf>.+)/statistics/

 

7. From here, we want to differentiate between ingress and egress (input and output) and between octets, errors and discards, so we will need another regex. As you can see, we have inErrors and outErrors, inOctets and outOctets, inDiscards and outDiscards.

We can differentiate between in and out with the non-capturing group (?:in|out), which matches either ‘in’ or ‘out’.

To label it, we give it a name as before with ?P<name> and wrap it in a capturing group: (?P<direction>(?:in|out))

The last thing we want to match in this case is the counter type, so we use another capturing group for that. The last part of the regex is therefore:

(?P<direction>(?:in|out))(Octets|Errors|Discards)

regex101.com gives in-depth detail on what each group matches. This is how the whole metric path was constructed:

 

/Smash/counters/ethIntf/XpCounters/current/(counter)/(?P<intf>.+)/statistics/(?P<direction>(?:in|out))(Octets|Errors|Discards)
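
To see which of the statistics keys the finished expression selects, here is a short Python check. It builds one candidate path per key of the statistics dictionary shown in step 6 and keeps the ones that match. Requiring the whole path to match (fullmatch) is a choice made for this sketch; I’m not claiming this is how the connector applies the expression.

```python
import re

METRIC_PATH = re.compile(
    r"/Smash/counters/ethIntf/XpCounters/current/(counter)/(?P<intf>.+)"
    r"/statistics/(?P<direction>(?:in|out))(Octets|Errors|Discards)"
)

# Keys of the "statistics" dictionary shown for Ethernet25.
STAT_KEYS = [
    "inBroadcastPkts", "inDiscards", "inErrors", "inMulticastPkts",
    "inOctets", "inTotalPkts", "inUcastPkts", "lastUpdate",
    "outBroadcastPkts", "outDiscards", "outErrors", "outMulticastPkts",
    "outOctets", "outUcastPkts",
]
BASE = "/Smash/counters/ethIntf/XpCounters/current/counter/Ethernet25/statistics/"

selected = []
for key in STAT_KEYS:
    m = METRIC_PATH.fullmatch(BASE + key)
    if m:
        # group(4) is the unnamed (Octets|Errors|Discards) group;
        # the non-capturing (?:in|out) group is not numbered.
        selected.append((m.group("intf"), m.group("direction"), m.group(4)))

for row in selected:
    print(row)
```

Only the six in/out Octets, Errors and Discards keys survive; the packet counters and lastUpdate are ignored.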

 

For any other path that we stream, you can use curl as well. The formula is:

 

curl localhost:6060/rest/ + path

 

DOM values 

Subscription path: /Sysdb/hardware/archer/xcvr/status

Interesting metrics:

  • Modular 
    • QSFP
      • RX Power
      • TX Power
    • SFP
      • RX Power
      • TX Power
  • Fixed
    • QSFP
      • RX Power
      • TX Power
    • SFP
      • RX Power
      • TX Power

 

QSFP RX Power on Modular boxes

 

Metric path:

/Sysdb/hardware/archer/(xcvr)/status/slice/(?P<linecard>.+)/(?P<intf>.+)/domRegisterData/lane(?P<lane>\d)(OpticalRxPower)

 

Let’s see how this is constructed. 

 

We have 4 regular expressions in this metric

  1. (xcvr)
  2. (?P<linecard>.+)
  3. (?P<intf>.+)
  4. (?P<lane>\d)(OpticalRxPower)

 

1. (xcvr) is a capturing group: the text matched by the regex inside the parentheses is captured into a numbered group that can be reused later with a numbered backreference. It is optional in this case, as we are not doing anything with the “xcvr” string.

2. (?P<linecard>.+) is a named capture group that matches any characters except line terminators one or more times, so we will match all subpaths under /Sysdb/hardware/archer/xcvr/status/slice

 

In this case we will match Linecard3, Linecard4, Linecard5 and we’ll create a label called linecard so later we can filter on it in our graphing system

 

[admin@tg227 ~]$ curl localhost:6060/rest/Sysdb/hardware/archer/xcvr/status/slice

{
    "Linecard3": {
        "_ptr": "/Sysdb/hardware/archer/xcvr/status/slice/Linecard3"
    },
    "Linecard4": {
        "_ptr": "/Sysdb/hardware/archer/xcvr/status/slice/Linecard4"
    },
    "Linecard5": {
        "_ptr": "/Sysdb/hardware/archer/xcvr/status/slice/Linecard5"
    }
}

 

When in doubt, always check it on regex101.com.

Note that .+ matches any characters, not just names that contain the string “Linecard”.

 

3. (?P<intf>.+) is a named capture group, that creates a label called “intf” and matches any characters except line terminators one or more times, so we will match all subpaths under each Linecard’s folder, so we will match any interface under any Linecard

 

/Sysdb/hardware/archer/xcvr/status/slice/Linecard3

/Sysdb/hardware/archer/xcvr/status/slice/Linecard4

/Sysdb/hardware/archer/xcvr/status/slice/Linecard5

 

[admin@tg227 ~]$ curl localhost:6060/rest/Sysdb/hardware/archer/xcvr/status/slice/Linecard3

{
    "Ethernet3/1": {
        "_ptr": "/Sysdb/hardware/archer/xcvr/status/slice/Linecard3/Ethernet3/1"
    },
    "Ethernet3/10": {
        "_ptr": "/Sysdb/hardware/archer/xcvr/status/slice/Linecard3/Ethernet3/10"
    },
    "Ethernet3/11": {
        "_ptr": "/Sysdb/hardware/archer/xcvr/status/slice/Linecard3/Ethernet3/11"
    },
    "Ethernet3/12": {
        "_ptr": "/Sysdb/hardware/archer/xcvr/status/slice/Linecard3/Ethernet3/12"
    },
<omitted>

 

[admin@tg227 ~]$ curl localhost:6060/rest/Sysdb/hardware/archer/xcvr/status/slice/Linecard4

{
    "Ethernet4/1": {
        "_ptr": "/Sysdb/hardware/archer/xcvr/status/slice/Linecard4/Ethernet4/1"
    },
    "Ethernet4/10": {
        "_ptr": "/Sysdb/hardware/archer/xcvr/status/slice/Linecard4/Ethernet4/10"
    },
<omitted>

 

[admin@tg227 ~]$ curl localhost:6060/rest/Sysdb/hardware/archer/xcvr/status/slice/Linecard5

{
    "Ethernet5/1": {
        "_ptr": "/Sysdb/hardware/archer/xcvr/status/slice/Linecard5/Ethernet5/1"
    },
    "Ethernet5/10": {
        "_ptr": "/Sysdb/hardware/archer/xcvr/status/slice/Linecard5/Ethernet5/10"
    },
    "Ethernet5/11": {
        "_ptr": "/Sysdb/hardware/archer/xcvr/status/slice/Linecard5/Ethernet5/11"
    },
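
One thing worth checking when two .+ groups sit next to each other is how they share a path whose interface name contains a slash, such as Ethernet3/2. The Python sketch below runs the metric path from this section, exactly as written, against a plain "/"-joined string, next to a variant of my own in which the linecard group may not cross a "/". How the connector actually renders path elements is not shown in this article, so take this as a sanity check to repeat on your own setup rather than a known problem.

```python
import re

PATH = (
    "/Sysdb/hardware/archer/xcvr/status/slice/Linecard3/Ethernet3/2"
    "/domRegisterData/lane2OpticalRxPower"
)

AS_WRITTEN = re.compile(
    r"/Sysdb/hardware/archer/(xcvr)/status/slice/(?P<linecard>.+)/(?P<intf>.+)"
    r"/domRegisterData/lane(?P<lane>\d)(OpticalRxPower)"
)
# Variant (not from this article): the linecard group cannot contain "/".
STRICTER = re.compile(
    r"/Sysdb/hardware/archer/(xcvr)/status/slice/(?P<linecard>[^/]+)/(?P<intf>.+)"
    r"/domRegisterData/lane(?P<lane>\d)(OpticalRxPower)"
)

for name, pattern in (("as written", AS_WRITTEN), ("stricter", STRICTER)):
    m = pattern.fullmatch(PATH)
    print(name, "| linecard:", m.group("linecard"), "| intf:", m.group("intf"))
```

With Python’s re module, the greedy first group takes "Linecard3/Ethernet3" and leaves "2" for intf, while the stricter variant yields "Linecard3" and "Ethernet3/2".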

 

 

4. lane(?P<lane>\d)(OpticalRxPower) matches a string that starts with “lane”, followed by a single digit (\d) captured in a named group with the label name “lane”, followed by the string “OpticalRxPower”. So basically:

 

  • laneXOpticalRxPower, where X=1,2,3,4 (and we separate these each so we can graph them separately later)

 

For example, Ethernet3/2 on tg227 is a 40GBASE-SR4. The DOM values are under the domRegisterData key, and here’s what they look like:

 

[admin@tg227 ~]$ curl localhost:6060/rest/Sysdb/hardware/archer/xcvr/status/slice/Linecard3/Ethernet3/2

{
<omitted>
"domRegisterData": {
        "lane1OpticalRxPower": -30,
        "lane1TxBias": 0,
        "lane1TxPower": -30,
        "lane2OpticalRxPower": -3.6794783329418976,
        "lane2TxBias": 0,
        "lane2TxPower": -30,
        "lane3OpticalRxPower": -4.0882404968820865,
        "lane3TxBias": 7.2620000000000005,
        "lane3TxPower": -0.6248210798265319,
        "lane4OpticalRxPower": -3.007694971165911,
        "lane4TxBias": 0,
        "lane4TxPower": -30,
        "temperature": 24.46875,
        "voltage": 3.2854
}

 

Checking on regex101, we can see we are matching the OpticalRxPower for all 4 lanes

/Sysdb/hardware/archer/(xcvr)/status/slice/(?P<linecard>.+)/(?P<intf>.+)/domRegisterData/lane(?P<lane>\d)(OpticalRxPower)
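
The last part of the expression can be checked without regex101 as well. The Python sketch below applies it to the domRegisterData keys of Ethernet3/2 shown above and collects the RX power per lane, which is roughly the split you would later graph.

```python
import re

# domRegisterData for Ethernet3/2, copied from the output above.
DOM = {
    "lane1OpticalRxPower": -30,
    "lane1TxBias": 0,
    "lane1TxPower": -30,
    "lane2OpticalRxPower": -3.6794783329418976,
    "lane2TxBias": 0,
    "lane2TxPower": -30,
    "lane3OpticalRxPower": -4.0882404968820865,
    "lane3TxBias": 7.2620000000000005,
    "lane3TxPower": -0.6248210798265319,
    "lane4OpticalRxPower": -3.007694971165911,
    "lane4TxBias": 0,
    "lane4TxPower": -30,
    "temperature": 24.46875,
    "voltage": 3.2854,
}

LANE_RX = re.compile(r"lane(?P<lane>\d)(OpticalRxPower)")

rx_power = {}
for key, value in DOM.items():
    m = LANE_RX.fullmatch(key)
    if m:
        # The "lane" label becomes the dictionary key.
        rx_power[int(m.group("lane"))] = value

print(rx_power)
```

The TxBias, TxPower, temperature and voltage keys are skipped because they do not end in OpticalRxPower.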

 

How it looks on the server side

Open-source telemetry systems usually have a query language that lets users build custom graphs by filtering on labels.

Example 1 – Rx Power for QSFP on Fixed systems

The Metric Path is:

/Sysdb/hardware/archer/(xcvr)/status/all/(?P<intf>.+)/domRegisterData/lane(?P<lane>\d)(OpticalRxPower)

You can filter on the

  • named capturing groups: intf and lane
  • unnamed capturing groups: unnamedLabel1 for xcvr and unnamedLabel4 for OpticalRxPower
  • target IPs, aka instance (switch IP)
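
The numbers in unnamedLabel1 and unnamedLabel4 line up with the position of each capturing group in the expression. The Python sketch below mimics that naming to show which label set one path would produce. It is an illustration of the numbering only, not the connector’s code, and the Ethernet24 path is a hypothetical example for a fixed system.

```python
import re

METRIC_PATH = re.compile(
    r"/Sysdb/hardware/archer/(xcvr)/status/all/(?P<intf>.+)"
    r"/domRegisterData/lane(?P<lane>\d)(OpticalRxPower)"
)


def labels_for(pattern, path):
    """Return label name -> value, naming unnamed groups unnamedLabel<N>."""
    m = pattern.fullmatch(path)
    if m is None:
        return None
    # groupindex maps group name -> number; invert it for the lookup.
    names = {index: name for name, index in pattern.groupindex.items()}
    return {
        names.get(i, f"unnamedLabel{i}"): m.group(i)
        for i in range(1, pattern.groups + 1)
    }


# Hypothetical path; Ethernet24 is only an example interface name.
example = "/Sysdb/hardware/archer/xcvr/status/all/Ethernet24/domRegisterData/lane1OpticalRxPower"
print(labels_for(METRIC_PATH, example))
```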

 

See below examples:

Example 2 – Rx/Tx Power for SFPs on Fixed Systems

Example 3 – interface counters

Graph examples on Grafana 

Data streamed from octsdb to Graphite

 

Grafana has its own query language too and you can connect your various backends to it and graph your metrics using queries.

 

In the below example I’m reading the temperature values for Ethernet24 using the following expression:

 

seriesByTag('name=eos.xcvr.temperature','intf=Ethernet24')
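
If you generate dashboards from a script, expressions like this can be built from the tag names and values. A minimal Python sketch follows; series_by_tag is a made-up helper that only does string formatting and does not validate tag names.

```python
def series_by_tag(**tags):
    """Build a seriesByTag() expression from tag=value pairs."""
    parts = ",".join(f"'{name}={value}'" for name, value in tags.items())
    return f"seriesByTag({parts})"


print(series_by_tag(name="eos.xcvr.temperature", intf="Ethernet24"))
```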

 

In Grafana, you can either write the queries/expressions or use the wizard and select the key/values from the dropdown

Data streamed from ocprometheus to Prometheus

 

In the below example I’m plotting the 1m aggregate for RX Octets counters for interface Ethernet3/1 using the following expression:

 

 

rate(intfCounter{job="arista",instance="172.28.160.232:8080",intf="Ethernet3\\/1",type="Octets"}[1m])*8
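
The trailing *8 converts octets per second into bits per second. As a rough illustration of the arithmetic, here is a Python sketch with made-up sample values. It ignores the extrapolation and counter-reset handling that Prometheus’s rate() performs, so it is a simplification rather than a reimplementation.

```python
def bits_per_second(first, last):
    """Average bit rate between two (timestamp_seconds, octets) samples."""
    (t0, octets0), (t1, octets1) = first, last
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    # Octets per second over the window, times 8 bits per octet.
    return (octets1 - octets0) / (t1 - t0) * 8


# Hypothetical samples taken one minute apart.
print(bits_per_second((0, 164_649_455), (60, 164_656_955)))
```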

 

For more information on how to set up ocprometheus please visit: https://eos.arista.com/streaming-eos-telemetry-states-to-prometheus/

 

Sample Configs

TerminAttr

No VRF

!
daemon TerminAttr
  exec /usr/bin/TerminAttr -grpcaddr 0.0.0.0:6042 -disableaaa -allowed_ips 10.83.13.78/32
  no shutdown  
!

 

VRF

!
daemon TerminAttr
  exec /usr/bin/TerminAttr -grpcaddr management/0.0.0.0:6042 -disableaaa -allowed_ips 10.83.13.78/32
  no shutdown
!

Ocprometheus

No VRF

!
daemon ocprometheus
  exec /usr/bin/ocprometheus -config /mnt/flash/ocprometheus.yml -addr localhost:6042
  no shutdown
!

 

VRF

!

daemon ocprometheus
  exec /sbin/ip netns exec ns-management /usr/bin/ocprometheus -config /mnt/flash/ocprometheus.yml -addr localhost:6042
  no shutdown
!

 

Ockafka

No VRF

!
daemon ockafka
  exec /mnt/flash/ockafka -addrs 10.83.13.139:6042 -kafkaaddrs 10.83.13.76:9092 -kafkatopic test -subscribe /Sysdb/environment/archer/temperature/status/system/,/Kernel/proc/cpu
  no shutdown
  no shutdown
!

 

VRF

!
daemon ockafka
  exec /sbin/ip netns exec ns-management /mnt/flash/ockafka -addrs 10.83.13.139:6042 -kafkaaddrs 10.83.13.76:9092 -kafkatopic test -subscribe /Sysdb/environment/archer/temperature/status/system/,/Kernel/proc/cpu
  no shutdown
!

Octsdb

No VRF

!
daemon octsdb
  exec /mnt/flash/octsdb -addr 10.83.13.139:6042 -config /mnt/flash/sampleconfig.json -tsdb 10.83.37.97:2003
  no shutdown
!

VRF

!
daemon octsdb
  exec /sbin/ip netns exec ns-management /mnt/flash/octsdb -addr 10.83.13.139:6042 -config /mnt/flash/sampleconfig.json -tsdb 10.83.37.97:2003
  no shutdown
!
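
The connector configs above differ between the VRF and non-VRF case only in the /sbin/ip netns exec prefix, so they are easy to template. The Python sketch below is a hypothetical helper for generating such stanzas. It assumes the namespace name is "ns-" followed by the VRF name, which I’m generalising from the ns-management examples above, and it does not cover TerminAttr, which takes the VRF in its -grpcaddr option instead.

```python
def exec_line(command, vrf=None):
    """Return the 'exec' line for a connector daemon, optionally inside a VRF."""
    if vrf:
        # Assumption: the namespace is "ns-" + VRF name, as in "ns-management".
        command = f"/sbin/ip netns exec ns-{vrf} {command}"
    return f"  exec {command}"


def daemon_config(name, command, vrf=None):
    """Render a complete daemon stanza in the layout used above."""
    return "\n".join(
        ["!", f"daemon {name}", exec_line(command, vrf), "  no shutdown", "!"]
    )


OCTSDB = (
    "/mnt/flash/octsdb -addr 10.83.13.139:6042 "
    "-config /mnt/flash/sampleconfig.json -tsdb 10.83.37.97:2003"
)
print(daemon_config("octsdb", OCTSDB, vrf="management"))
```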

 

Useful Links:

https://regex101.com/

https://www.rexegg.com/regex-quickstart.html

https://www.regular-expressions.info/refcapture.html

https://www.regular-expressions.info/named.html
