This article walks through an experiment that demonstrates what happens when an EOS agent is killed, whether deliberately or as the result of a failure.
For this article, and for testing purposes only, we take a few preparation steps that are not strictly necessary in a production environment:
1) Clear the logs to improve visibility (less noise in the logs)
2) Configure high resolution logging timestamps
Arista (config)#logging format timestamp high-resolution
Killing an agent
1) Access the bash shell
2) Select a process and find its PID (Process ID)
[admin@Arista ~]$ ps -A | grep Stp
1661 ? 00:00:00 StpTopology
4160 ? 00:00:01 Stp
This example lists the Spanning Tree processes.
3) Kill the process
[admin@Arista ~]$ sudo kill 4160
4) Verify that the process has restarted; note the new PID for the Stp process
[admin@Arista ~]$ ps -A | grep Stp
1661 ? 00:00:00 StpTopology
4378 ? 00:00:00 Stp
5) Check the syslog portion related to the agent restart
[admin@s7151 ~]$ sudo tail /var/log/messages
2014-01-23T00:12:23.375251+00:00 s7151 ProcMgr-worker: %PROCMGR-6-PROCESS_TERMINATED: 'Stp' (PID=4158) has terminated.
2014-01-23T00:12:23.375906+00:00 s7151 ProcMgr-worker: %PROCMGR-6-PROCESS_RESTART: Restarting 'Stp' immediately (it had PID=4158)
2014-01-23T00:12:23.382093+00:00 s7151 ProcMgr-worker: %PROCMGR-6-PROCESS_STARTED: 'Stp' starting with PID=4376 (PPID=1601) -- execing '/usr/bin/Stp'
2014-01-23T00:12:24.529032+00:00 s7151 Stp: %SPANTREE-6-INTERFACE_ADD: Interface Ethernet19 has been added to instance MST0
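Steps 2 and 4 compare two process listings by eye. The same check can be sketched in a few lines of Python. This is only an illustration: `parse_ps` and `restarted_agents` are hypothetical helper names, and the input is assumed to be `ps -A` style text (PID, TTY, TIME, command) such as the listings shown in those steps.

```python
def parse_ps(output: str) -> dict[str, int]:
    """Map command name -> PID from `ps -A` style lines (PID TTY TIME CMD)."""
    pids = {}
    for line in output.strip().splitlines():
        fields = line.split()
        # Skip a header row or anything that does not start with a numeric PID.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        # If several processes share a name, the last one listed wins.
        pids[fields[3]] = int(fields[0])
    return pids


def restarted_agents(before: str, after: str) -> dict[str, tuple[int, int]]:
    """Return {name: (old_pid, new_pid)} for processes whose PID changed."""
    old, new = parse_ps(before), parse_ps(after)
    return {
        name: (old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    }


# Listings copied from steps 2 and 4.
BEFORE = """\
1661 ? 00:00:00 StpTopology
4160 ? 00:00:01 Stp
"""
AFTER = """\
1661 ? 00:00:00 StpTopology
4378 ? 00:00:00 Stp
"""

print(restarted_agents(BEFORE, AFTER))  # {'Stp': (4160, 4378)}
```

StpTopology keeps PID 1661 in both listings, so only Stp is reported as restarted.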
The syslog messages above describe the following chain of events:
- 12:23.375251: The kill command is issued and the Stp agent with process ID 4158 terminates immediately.
- 12:23.375906 (655 µs later): The watchdog process within ProcMgr detects that the Stp agent has died and automatically restarts it.
- 12:23.382093 (about 6.2 ms later): The Stp agent is started again, this time with process ID 4376.
- Note that only about 6.8 milliseconds elapsed between the agent being killed and the agent being restarted (12:23.375251 -> 12:23.382093).
- 12:24.529032 (about 1.15 s after the restart): The new Stp agent is running again and adds interface Ethernet19, which is a switchport, to instance MST0.
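The intervals quoted above can be recomputed directly from the high-resolution timestamps. The following is a minimal sketch, assuming each syslog line has the layout shown in the excerpt (ISO 8601 timestamp, host, process, mnemonic, message); `event_times` and `deltas_us` are hypothetical helper names, not part of EOS.

```python
from datetime import datetime, timedelta

# Log lines copied from the /var/log/messages excerpt in step 5.
LOG = """\
2014-01-23T00:12:23.375251+00:00 s7151 ProcMgr-worker: %PROCMGR-6-PROCESS_TERMINATED: 'Stp' (PID=4158) has terminated.
2014-01-23T00:12:23.375906+00:00 s7151 ProcMgr-worker: %PROCMGR-6-PROCESS_RESTART: Restarting 'Stp' immediately (it had PID=4158)
2014-01-23T00:12:23.382093+00:00 s7151 ProcMgr-worker: %PROCMGR-6-PROCESS_STARTED: 'Stp' starting with PID=4376 (PPID=1601) -- execing '/usr/bin/Stp'
2014-01-23T00:12:24.529032+00:00 s7151 Stp: %SPANTREE-6-INTERFACE_ADD: Interface Ethernet19 has been added to instance MST0
"""


def event_times(log: str) -> list[tuple[str, datetime]]:
    """Return (mnemonic, timestamp) for each log line."""
    events = []
    for line in log.strip().splitlines():
        # Assumed layout: timestamp, host, process, mnemonic, free-form message.
        stamp, _host, _proc, mnemonic, _msg = line.split(" ", 4)
        events.append((mnemonic.rstrip(":"), datetime.fromisoformat(stamp)))
    return events


def deltas_us(events: list[tuple[str, datetime]]) -> list[tuple[str, int, int]]:
    """Return (mnemonic, µs since previous event, µs since first event)."""
    one_us = timedelta(microseconds=1)
    first = prev = events[0][1]
    rows = []
    for name, when in events:
        rows.append((name, (when - prev) // one_us, (when - first) // one_us))
        prev = when
    return rows


for name, step, total in deltas_us(event_times(LOG)):
    print(f"{name:32} +{step:>8} µs  (total {total:>8} µs)")
```

For the excerpt above this gives 655 µs from termination to the restart decision, 6187 µs more until the new process starts (6842 µs in total), and a further 1146939 µs until Ethernet19 is added to MST0.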