- Get the extension:
- Build and install for EOS
- Enable eAPI on the switch(es):
- Edit the configuration:
- Optionally, test your SNMP configuration from a shell:
- Start rphm:
- Example trap message:
- Save extensions and config:
- Other related extensions:
Remote Port Health Manager (rphm) monitors interface counters on one or more EOS devices. It will send an SNMP trap to a management station whenever one of those counters increases at a rate greater than the defined threshold. Further, rphm is easily extensible so other actions could be added.
Example uses include:
- I want to know when one of my critical ports gets more than N CRC errors during a window of time.
- Am I receiving excessive rxPause frames from certain paths, and if so, when?
- When is there a burst in broadcast or multicast packets being transmitted from a certain port?
- Trigger a warning when critical link utilization nears saturation.
- Get proactive alerts on your network monitoring system when any selected interface counter grows faster than desired.
- Run on-switch as an extension or on a separate monitoring server.
- Define devices, interfaces, and statistics to be monitored as well as thresholds and poll frequency in the simple config file.
- Supports SNMP V2c and V3.
- Implemented actions: snmptrap. Others may be easily added in the source.
Get the extension:
- Arista eAPI (EOS 4.12 or later)
- Rpmbuild tools on Linux are required to build the extension. If rphm is installed on a non-EOS Linux system, net-snmp and jsonrpclib are also required.
Build and install for EOS
Build the extension:
bash $ git clone https://github.com/arista-eosext/rphm.git
bash $ cd rphm/
bash $ make rpm
Copy to the switch and install:
Arista#copy scp://user@buildhost/<path>/rphm/rpmbuild/rphm-1.0.0-1.rpm extensions:
Arista#extension rphm-<ver>.rpm
Enable eAPI on the switch(es):
Arista(config)#username <name> privilege 15 secret <password>
Arista(config)#management api http-commands
Arista(config-mgmt-api-http-cmds)#no shutdown
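Once eAPI is enabled, a client talks to the switch with JSON-RPC 2.0 requests posted to the /command-api URL. The sketch below shows the shape of such a request using only the Python standard library. The function names, the request id, and the choice of `show interfaces Ethernet1` as the polled command are assumptions made for illustration; this is not necessarily how rphm itself issues its queries.

```python
import base64
import json
import urllib.request


def build_eapi_request(cmds, request_id="rphm-sketch"):
    """Build the JSON-RPC 2.0 body that eAPI expects for a list of CLI commands."""
    return {
        "jsonrpc": "2.0",
        "method": "runCmds",
        "params": {"version": 1, "cmds": list(cmds), "format": "json"},
        "id": request_id,
    }


def post_eapi(host, username, password, cmds, timeout=10):
    """POST the request to a switch and return the per-command results.

    Hypothetical helper: it needs a reachable switch, so it is not called here.
    """
    payload = json.dumps(build_eapi_request(cmds)).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(
        f"https://{host}/command-api",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["result"]


# Inspect the request body without contacting a switch.
print(json.dumps(build_eapi_request(["show interfaces Ethernet1"]), indent=2))
```

A switch that still uses its default self-signed certificate will likely fail the certificate verification that `urlopen` performs, so a real client would need to supply an appropriate SSL context.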
Edit the configuration:
Arista#bash
[admin@Arista ~]$sudo vi /persist/sys/rphm.conf
Configure [snmp] settings:
Set the traphost, then uncomment and configure the appropriate lines for SNMP V2c or V3.
Configure [counters] poll interval:
This is the time to wait between successive polls.
Configure [switches] settings:
For a single switch deployment, add the switch’s hostname or IP to the “switchList” in the [switches] section. Then in the [DEFAULT] section, set the eAPI connection info, such as username and password and the default interfaceList.
If configuring multiple switches, use the [DEFAULT] section for common items. Then copy the portions of the [DEFAULT] section that need to be unique into new sections, where each section name is the hostname, IP, or friendly name of a switch. See the config file for examples.
Configure per-counter thresholds:
Also in the [DEFAULT] section, set the counterList, which defines the counters to monitor, and adjust the threshold for each of those counters. Both can be overridden in per-switch sections.
# rphm.conf
#
[snmp]
traphost = snmp-traphost.example.com
# SNMP v2:
version = 2c
community = eosplus

[counters]
# Seconds between polls
poll = 300

[DEFAULT]
#protocol=https
#port=443
#hostname=localhost
#username=arista
#password=arista
#url = %(protocol)s://%(username)s:%(password)s@%(hostname)s:%(port)s/command-api

# The default list of interfaces to monitor on any switch
interfaceList="Management1", "Ethernet1", "Ethernet2"

# The default list of counters to monitor on each interface.
# NOTE: a threshold must be defined for each counter.
counterList=totalInErrors, totalOutErrors, fcsErrors, symbolErrors

# Default thresholds:
totalInErrors=20
totalOutErrors=20
alignmentErrors=1
fcsErrors=1

# Simple method for defining the switch(es) to monitor with default options.
switchList=10.10.10.11, localhost, spine-l3-04.example.com

# In a multi-switch monitoring setup, defaults may be overridden on a per-switch basis
#[vEOS-1]
#hostname=10.10.10.100
#password="different-pass"
#counterList=inUcastPkts,
#    inDiscards
#inUcastPkts = 4000000
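The `%(name)s` placeholders in the url line follow the interpolation syntax of Python's `configparser` module, where values in [DEFAULT] are inherited by every other section. Assuming the file is read that way, the following sketch illustrates how a per-switch section overrides only the values it redefines. The `SAMPLE` text is a trimmed, uncommented variant of the file above, and `load_switch_settings` is a hypothetical helper, not a function taken from rphm.

```python
import configparser

# Trimmed, uncommented variant of rphm.conf used only for this illustration.
SAMPLE = """
[DEFAULT]
protocol = https
port = 443
hostname = localhost
username = arista
password = arista
url = %(protocol)s://%(username)s:%(password)s@%(hostname)s:%(port)s/command-api
counterList = totalInErrors, fcsErrors
totalInErrors = 20
fcsErrors = 1

[vEOS-1]
hostname = 10.10.10.100
counterList = inUcastPkts
inUcastPkts = 4000000
"""


def load_switch_settings(text, switch):
    """Return the eAPI URL and counter thresholds that apply to one switch."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(switch):
        # A switch without its own section simply inherits everything in [DEFAULT].
        parser.add_section(switch)
    section = parser[switch]
    counters = [c.strip() for c in section["counterList"].split(",") if c.strip()]
    # Each monitored counter must have a threshold defined under its own name.
    thresholds = {name: section.getint(name) for name in counters}
    return {"url": section["url"], "thresholds": thresholds}


print(load_switch_settings(SAMPLE, "vEOS-1"))
print(load_switch_settings(SAMPLE, "10.10.10.11"))
```

Because the url template is interpolated per section, overriding only `hostname` in [vEOS-1] is enough to change the URL for that switch, while the username, password, and port still come from [DEFAULT].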
Optionally, test your SNMP configuration from a shell:
The following command will send a single test trap to your traphost:
[admin@Arista ~]$/usr/bin/rphm [--config=<path-to>/my.conf] [--debug] --test=trap
Start rphm:
Arista(config)#daemon rphm
Arista(config-daemon-rphm)#command /usr/bin/rphm
Arista(config-daemon-rphm)#exit
Example trap message:
Rphm uses the enterprise-specific, generic trap OID:
.iso.org.dod.internet.private.arista.generic (.1.3.6.1.4.1.30065.6) string
“Device my-switch-02 DCS-7048T-4S-R, interface Ethernet2: fcsErrors increasing at > 1 per 30 seconds. Found 9/3284 packets in”
Save extensions and config:
In order for rphm to run after a reload, save the configuration and extensions.
Arista#copy installed-extensions boot-extensions Arista#copy running-config startup-config
Other related extensions:
A related extension, the Port Health Monitor script, generates syslog notifications whenever the FCS or symbol error counters on an interface exceed pre-configured levels and can optionally shut down interfaces with high error rates over consecutive poll intervals. The Port Health Monitor thresholds measure the change in the error counters since the script started or the interface came up. Rphm, on the other hand, compares changes in counters since the last poll interval.
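The per-poll comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not rphm's actual code: `exceeded` is a hypothetical function, the counter snapshots are plain dictionaries, and a counter that went backwards (for example after a counter reset) is simply treated as not exceeding its threshold.

```python
def exceeded(previous, current, thresholds):
    """Return {counter: delta} for counters that grew by more than their threshold
    between two consecutive polls."""
    alerts = {}
    for name, limit in thresholds.items():
        if name not in previous or name not in current:
            continue  # counter missing from one of the snapshots
        delta = current[name] - previous[name]
        if delta > limit:
            alerts[name] = delta
    return alerts


# Two hypothetical snapshots of one interface, taken one poll interval apart.
last_poll = {"fcsErrors": 3, "totalInErrors": 100}
this_poll = {"fcsErrors": 12, "totalInErrors": 110}
limits = {"fcsErrors": 1, "totalInErrors": 20}

print(exceeded(last_poll, this_poll, limits))
```

In this example only fcsErrors is reported: it grew by 9 against a threshold of 1, while totalInErrors grew by 10, which is within its threshold of 20.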