How to build and install DPDKCap

 
 

Introduction

DPDKCap is a high-performance packet capture tool based on DPDK. This guide explains how to build, install and use DPDKCap on a CentOS 7 based system.

Arista Fork : https://github.com/aristanetworks/dpdkcap

Assumptions

  • CentOS 7 Linux
  • NVMe capture drive (not mandatory but recommended for line rate capture)
  • Running as root user
  • CPU & NIC combination that supports DPDK

System used to validate performance

  • Manufacturer: Supermicro
  • Part number: SYS-E300-8D
  • Processor: Intel Xeon CPU D-1518
  • Memory: 2x Micron 9ASF1G72PZ-2G3A1 8GB DIMMs
  • HDD: Samsung 860 PRO SSD 4TB
  • NVMe: Samsung 960 EVO 1TB

Build steps

Create a directory at /data and format and mount the NVMe drive:

# Make a mount location
>$ mkdir -p /data
 
# Create a partition on the NVMe drive
>$ fdisk /dev/nvme0n1
# Choose "n" to create a new partition
# Then "p" for a primary partition and "1" for the partition number
# Finally, select "w" to write the partition table to disk
 
# Format the new partition
>$ mkfs -t ext4 /dev/nvme0n1p1
 
# Find the UUID of the new partition (/dev/nvme0n1p1)
>$ blkid

/dev/nvme0n1p1: UUID="fe209393-3547-46c4-bb63-63825ac2db3f" TYPE="ext4"

 
# Add the partition to /etc/fstab (backup first)
>$ cp /etc/fstab /etc/fstab-backup
>$ echo -e "UUID=fe209393-3547-46c4-bb63-63825ac2db3f\t/data\text4\tdefaults\t0\t0" >> /etc/fstab
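
The fstab entry above can also be generated rather than typed by hand. The following is a minimal sketch, assuming a POSIX shell; `fstab_entry` is a hypothetical helper name, not part of any tool used in this guide, and it only prints the line so that it can be reviewed before being appended to /etc/fstab.

```shell
# Hypothetical helper: print one tab-separated /etc/fstab entry for a UUID.
fstab_entry() {
    # $1 = filesystem UUID, $2 = mount point, $3 = filesystem type
    printf 'UUID=%s\t%s\t%s\tdefaults\t0\t0\n' "$1" "$2" "$3"
}

# Example using the UUID from the blkid output above. On a real system the
# UUID can be read with: blkid -s UUID -o value /dev/nvme0n1p1
fstab_entry "fe209393-3547-46c4-bb63-63825ac2db3f" /data ext4
```

Once the entry is in /etc/fstab, running `mount /data` mounts the drive without waiting for the next reboot.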

 

Configure the system for DPDK by modifying the kernel command line:

Hugepages
---------
 
Before using DPDK some setup is required. Firstly, your operating system must be configured to use hugepages. To
do this on a CentOS system add the following to the end of your boot time kernel command line and then reboot your
system:
 
  hugepagesz=1G hugepages=4 intel_iommu=off
 
Additional performance options
------------------------------
 
To gain the best performance with DPDKCap it is also advisable to disable extraneous processes and reserve several cores
for use by DPDKCap only. This ensures that DPDKCap has exclusive use of the CPU cores that it is given, preventing packet
drops due to interruptions from other processes. The easiest way to do this is to enforce core isolation and prevent IRQs
on the cores that DPDKCap will be given, which can be achieved using the isolcpus and irqaffinity arguments on the kernel
command line.
 
It is also advisable on newer Linux kernels to turn off any Spectre and Meltdown mitigations within the kernel, as these can
heavily impact performance.
 
An example set of kernel command line arguments is as follows. This example turns off SELinux, disables Intel CPU power
management, and isolates CPU cores 1-7, leaving only core 0 for use by the OS. It also disables various Spectre and Meltdown
mitigations.
 
selinux=0 audit=0 tsc=reliable intel_idle.max_cstate=0 processor.max_cstate=0 isolcpus=1-7 nohz_full=1-7 rcu_nocbs=1-7 irqaffinity=0 nospec_store_bypass_disable noibrs noibpb spectre_v2_user=off spectre_v2=off nopti l1tf=off kvm-intel.vmentry_l1d_flush=never mitigations=off
 
Note: These kernel command line arguments were tested on CentOS 7.7.1908 (kernel version 3.10.0-1062.9.1).
 
# Backup the grub configuration files
>$ cp /etc/default/grub /etc/default/grub-backup
>$ cp /boot/grub2/grub.cfg /boot/grub2/grub.cfg-backup
 
# Edit the grub config and add 'selinux=0 audit=0 tsc=reliable intel_idle.max_cstate=0 processor.max_cstate=0 hugepagesz=1G hugepages=4 intel_iommu=off isolcpus=1-7 nohz_full=1-7 rcu_nocbs=1-7 irqaffinity=0 nospec_store_bypass_disable noibrs noibpb spectre_v2_user=off spectre_v2=off nopti l1tf=off kvm-intel.vmentry_l1d_flush=never mitigations=off idle=poll' to the end of the GRUB_CMDLINE_LINUX variable
 
>$ vim /etc/default/grub
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="nofb splash=quiet crashkernel=auto rd.lvm.lv=vg00/root rd.lvm.lv=vg00/swap biosdevname=0 net.ifnames=0 rhgb quiet selinux=0 audit=0 tsc=reliable intel_idle.max_cstate=0 processor.max_cstate=0 hugepagesz=1G hugepages=4 intel_iommu=off isolcpus=1-7 nohz_full=1-7 rcu_nocbs=1-7 irqaffinity=0 nospec_store_bypass_disable noibrs noibpb spectre_v2_user=off spectre_v2=off nopti l1tf=off kvm-intel.vmentry_l1d_flush=never mitigations=off idle=poll"
GRUB_DISABLE_RECOVERY="true"
 
# Regenerate the grub config files. As root, this command appears to fail when run from an NFS-mounted home directory with root squash enabled, hence the cd to the root directory first.
 
>$ cd /
>$ grub2-mkconfig -o /boot/grub2/grub.cfg
 
# Reboot so that the new boot parameters take effect
>$ reboot
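
After the reboot it is worth confirming that the new arguments actually reached the kernel. The following is a minimal sketch, assuming a POSIX shell; `has_karg` is a hypothetical helper, and the list of arguments checked is only a sample of those added to GRUB_CMDLINE_LINUX above.

```shell
# Hypothetical helper: succeed if an argument appears as a whole word in a
# kernel command line string.
has_karg() {
    # $1 = command line text, $2 = argument to look for
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# /proc/cmdline holds the command line of the running kernel.
cmdline=$(cat /proc/cmdline 2>/dev/null || true)
for arg in hugepagesz=1G hugepages=4 isolcpus=1-7 irqaffinity=0; do
    if has_karg "$cmdline" "$arg"; then
        echo "present: $arg"
    else
        echo "MISSING: $arg"
    fi
done
```

Matching on whole words avoids false positives such as `isolcpus=1-70` being accepted for `isolcpus=1-7`.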

 

Install the version of DPDK recommended by dpdkcap (19.11.0 at the time of writing) as follows (based on http://doc.dpdk.org/guides-19.11/linux_gsg/build_dpdk.html):

# Download DPDK
>$ wget http://fast.dpdk.org/rel/dpdk-19.11.tar.xz
 
# Install dependencies
>$ yum groupinstall "Development Tools"
>$ yum install numactl-devel kernel-devel-$(uname -r)
 
# Unpack, build and install DPDK into the filesystem
>$ tar -xJf dpdk-19.11.tar.xz
>$ cd dpdk-19.11/
>$ make install T=x86_64-native-linux-gcc DESTDIR=/usr/local
 
DPDK configuration
------------------
 
# Install the DPDK kernel modules and configure to load on boot
>$ cp -R /usr/local/lib/modules/$(uname -r)/extra/* /lib/modules/$(uname -r)/extra/
>$ depmod
>$ modprobe igb_uio
>$ modprobe vfio_pci
>$ echo -e "igb_uio\nvfio-pci" > /etc/modules-load.d/dpdk.conf
 
Next, bind the interfaces that you wish to use with DPDKCap to the DPDK poll mode drivers. Note that this
removes these interfaces from normal kernel usage, so the associated ports (e.g. eth0) will no longer show in ifconfig unless
they use a bifurcated driver.
 
To do this, first list the kernel devices:
 
  dpdk-devbind --status
 
This will show a list of devices which are using the kernel driver. Identify the devices which you would like to use
with DPDKCap from this list and make a note of their PCIe device IDs (the first column in the list). Then to bind these
devices to dpdk's userspace driver run:
 
  dpdk-devbind -b igb_uio <PCIe device ID> [<PCIe device ID> ...]
 
To confirm that these devices have been correctly bound, show the dpdk-devbind status again and check that the selected
devices now show up in the list of devices using a dpdk-compatible driver:
 
  dpdk-devbind --status
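
When several devices are bound at once, it can help to assemble the bind command from a checked list of PCIe device IDs. The following is a minimal sketch, assuming a POSIX shell with `grep -E`; `is_pci_addr` and `devbind_cmd` are hypothetical helpers, and the two device IDs are illustrative values in the full domain:bus:device.function form. The sketch only prints the command so it can be reviewed before it is run.

```shell
# Hypothetical helper: succeed if $1 looks like a full PCI address
# (domain:bus:device.function, e.g. 0000:04:00.0).
is_pci_addr() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$'
}

# Hypothetical helper: print a dpdk-devbind command for a driver and a list
# of PCI addresses, refusing anything that is not a PCI address.
devbind_cmd() {
    # $1 = driver name, remaining arguments = PCI addresses
    driver=$1
    shift
    cmd="dpdk-devbind -b $driver"
    for dev in "$@"; do
        if ! is_pci_addr "$dev"; then
            echo "not a PCI address: $dev" >&2
            return 1
        fi
        cmd="$cmd $dev"
    done
    echo "$cmd"
}

# Illustrative device IDs; substitute the ones noted from dpdk-devbind --status.
devbind_cmd igb_uio 0000:04:00.0 0000:04:00.1
```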
 
# Ensure that /usr/local/bin and /usr/local/sbin are on the root user's PATH (in the local session, and in future sessions)
>$ export PATH=/usr/local/sbin:/usr/local/bin:$PATH
>$ vim /root/.bash_profile
# After any current PATH modifications, add PATH=/usr/local/sbin:/usr/local/bin:$PATH, and make sure that there is an 'export PATH' statement afterwards
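
One possible form for the lines added to /root/.bash_profile is sketched below, assuming a POSIX-compatible shell. `path_prepend` is a hypothetical helper; it skips directories that are already on the PATH, so sourcing the profile more than once does not produce duplicate entries.

```shell
# Hypothetical helper: prepend a directory to PATH unless it is already there.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
}

# Prepending bin first and sbin second yields /usr/local/sbin:/usr/local/bin:...
path_prepend /usr/local/bin
path_prepend /usr/local/sbin
export PATH
```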
# Mount the huge pages and make the mount permanent
>$ mkdir /mnt/huge_1GB
>$ mount -o pagesize=1GB -t hugetlbfs nodev /mnt/huge_1GB
>$ echo -e "nodev\t/mnt/huge_1GB\thugetlbfs\tpagesize=1GB\t0\t0" >> /etc/fstab
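
To confirm that the 1 GB hugepages requested on the kernel command line were actually reserved, the per-size counters under /sys/kernel/mm/hugepages can be read. The following is a minimal sketch, assuming a POSIX shell; `hugepage_pool` is a hypothetical helper. With the boot arguments used in this guide, a total of 4 would be expected.

```shell
# Hypothetical helper: print the total and free page counts of one hugepage
# pool directory, e.g. /sys/kernel/mm/hugepages/hugepages-1048576kB.
hugepage_pool() {
    # $1 = pool directory containing nr_hugepages and free_hugepages
    if [ -r "$1/nr_hugepages" ] && [ -r "$1/free_hugepages" ]; then
        echo "total=$(cat "$1/nr_hugepages") free=$(cat "$1/free_hugepages")"
    else
        echo "pool not found: $1"
        return 1
    fi
}

# 1048576 kB = 1 GB pages
hugepage_pool /sys/kernel/mm/hugepages/hugepages-1048576kB || true
```

The free count drops while a DPDK application such as DPDKCap is running, because the application maps the pages for its packet buffers.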

 

Build and install the latest master version of DPDKCap as follows (based on the instructions in the DPDKCap README):

# Download dpdkcap
>$ wget https://github.com/aristanetworks/dpdkcap/archive/master.zip
 
# Install dependencies
>$ yum install ncurses-devel
 
# Export required environment variables
>$ export RTE_SDK=<path to the installed DPDK SDK>
>$ export RTE_TARGET=x86_64-native-linux-gcc
 
# Unpack, build and install dpdkcap into the filesystem
>$ unzip master.zip
>$ cd dpdkcap-master/
>$ make
>$ cp build/app/* /usr/bin

 

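Follow the DPDKCap README for more information on how to use DPDKCap and its command line parameters. The first command below prints the help text; remove --help from it to start a capture on the port selected by -p. The second command shows a capture running on port 1 (port mask 0x2).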

# dpdkcap -c 0xe -n 4 -- -p 0x1 -w /data/test -t -z -S --help

Usage: dpdkcap [OPTION...]
A DPDK-based packet capture tool
 
  -b, --burst_size=NUM       Size of receive burst (default: 128)
  -d, --rx_desc=DESC_MATRIX  This option can be used to override the default
                             number of RX descriptors configured for all queues
                             of each port (1024). RX_DESC_MATRIX can have
                             multiple formats:
                             - A single positive value, which will simply
                             replace the default  number of RX descriptors,
                             - A list of key-values, assigning a configured
                             number of RX descriptors to the given port(s).
                             Format:
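                               <matrix>   := <key>.<nb_rx_desc> { ","
                             <key>.<nb_rx_desc> "," ...
                               <key>      := { <interval> | <port> }
                               <interval> := <lower_port> "-" <upper_port>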
                               Examples:
                               512               - all ports have 512 RX desc
                             per queue
                               0.256, 1.512      - port 0 has 256 RX desc per
                             queue,
                                                   port 1 has 512 RX desc per
                             queue
                               0-2.256, 3.1024   - ports 0, 1 and 2 have 256 RX
                             desc per  queue,
                                                   port 3 has 1024 RX desc per
                             queue.
  -i, --mbuf_len=MBUF_LEN    Size (in bytes) of each MBUF (packet buffer).
                             Recommened value is 2KB + RTE_PKTMBUF_HEADROOM
                             (default: (2048 + 128))
  -j, --pbuf_len=PBUF_LEN    Size (in bytes) of each PBUF (pcap buffer).
                             Optimal values, are powers of 2 (2^q) (default:
                             1024 * 1024 * 128)
      --logs=FILE            Writes the logs into FILE instead of stderr.
  -m, --nb-mbuf=NB_MBUF      Number of memory buffers per core per port used to
                             store the DMA'd packets by the nic driver. Optimal
                             values, are powers of 2 (2^q) (default: 65536)
  -n, --nb_pbuf=NB_PBUF      Number of memory buffers per core per port used to
                             store received packets before being flushed to
                             disk. Optimal values, are powers of 2 (2^q)
                             (default: 4)
  -p, --portmask=PORTMASK    Ethernet ports mask (default: 0x1).
  -q, --nb_queues_per_port=QUEUES_PER_PORT
                             Number of queues per port (default: 1)
  -S, --stats                Print stats every few seconds.
  -t, --mw-timestamp         Use MetaWatch trailer timestamps.
  -w, --output=FILE          Output FILE template (don't add the extension).
                             Use "%COREID" for inserting the lcore id into the
                             file name (automatically added if not used).
                             (default: output_%COREID)
  -z, --flow-control         Enable flow control.
  -?, --help                 Give this help list
      --usage                Give a short usage message
  -V, --version              Print program version
 
Mandatory or optional arguments to long options are also mandatory or optional
for any corresponding short options.
 
# dpdkcap -c 0xe -n 4 -- -p 0x2 -w /data/test -t -z -S
EAL: Detected 8 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: No free hugepages reported in hugepages-2048kB
EAL: No available hugepages reported in hugepages-2048kB
EAL: Probing VFIO support...
EAL: PCI device 0000:04:00.0 on NUMA socket 0
EAL: probe driver: 8086:15ac net_ixgbe
EAL: PCI device 0000:04:00.1 on NUMA socket 0
EAL: probe driver: 8086:15ac net_ixgbe
EAL: PCI device 0000:07:00.0 on NUMA socket 0
EAL: probe driver: 8086:1533 net_e1000_igb
EAL: PCI device 0000:08:00.0 on NUMA socket 0
EAL: probe driver: 8086:1533 net_e1000_igb
EAL: PCI device 0000:0b:00.0 on NUMA socket 0
EAL: probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:0b:00.1 on NUMA socket 0
EAL: probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:0b:00.2 on NUMA socket 0
EAL: probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:0b:00.3 on NUMA socket 0
EAL: probe driver: 8086:1521 net_e1000_igb
DPDKCAP: Using 1 ports to listen on
DPDKCAP: Cores/Queues Per Port: 1 Burst Size: 128
DPDKCAP: MBufs: Num: 65536 Len: 2176 B PBufs: Num: 4 Len: 134217728 B
DPDKCAP: RX Burst Len: 278528 Watermark: 133939200
DPDKCAP: Flow control: ON Pause Burst Size: 128
DPDKCAP: Use MetaWatch trailer timestamps: ON
DPDKCAP: Disk (259:0) block size = 512
DPDKCAP: Using 3 cores out of 3 allocated
DPDKCAP: Port 1: MAC=0c:c4:7a:97:ef:35, RXdesc/queue=1024
DPDKCAP: MTU Info : Max = 15854B, Min = 68B
DPDKCAP: TX Desc Info : Max = 4096, Min = 32, Multiple of 8
DPDKCAP: TX Queue Info : Max = 64
DPDKCAP: RX Desc Info : Max = 4096, Min = 32, Multiple of 8
DPDKCAP: RX Queue Info : Max = 128
DPDKCAP: Launching capture process: worker=0, port=1, core=2, queue=0
DPDKCAP: Core 2 is capturing packets for port 1
DPDKCAP: Launching write process: worker=0, port=1, core=3, queue=0
DPDKCAP: Core 3 is writing using file template: /data/test_%COREID.pcap.
^CDPDKCAP: Caught signal Interrupt on core 1 (MASTER CORE)
DPDKCAP: Closed capture core 2 (port 1)
DPDKCAP: Waiting for all cores to exit
DPDKCAP: Closed writing core 3
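
The -c and -p values in the commands above are hexadecimal bitmasks in which bit N selects core N or port N. In the capture run, -c 0xe gives DPDKCap cores 1, 2 and 3 (master, capture and write, as the log shows) and -p 0x2 selects port 1. The following is a minimal sketch for computing such masks, assuming a POSIX shell; `bitmask` is a hypothetical helper name.

```shell
# Hypothetical helper: turn a list of core (or port) numbers into a hex bitmask.
bitmask() {
    mask=0
    for n in "$@"; do
        mask=$(( mask | (1 << n) ))   # set bit n
    done
    printf '0x%x\n' "$mask"
}

bitmask 1 2 3   # cores 1, 2 and 3 -> 0xe
bitmask 1       # port 1 -> 0x2
```

With the isolcpus=1-7 setting used earlier, any cores chosen from 1 to 7 are isolated from the OS scheduler, so a mask built from them keeps DPDKCap off core 0.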