Analyzing Network Performance

Submitted by Igor on November 25, 2019 – 9:37 pm

Much of network performance analysis is comparative in nature, so seeing the output of multiple commands side by side can be quite useful. Linux ships a handy little utility called pr (part of GNU coreutils), and we’ll make use of it together with Bash process substitution.
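If you have not used pr before, here is a minimal sketch of the merge mode the rest of this article relies on. The side_by_side function name and the sample file contents are made up for illustration.

```shell
# side_by_side: print any number of files as parallel columns.
#   -m  merge the files line by line, one file per column
#   -t  drop the page header and trailer
#   -w  total page width shared by all columns
side_by_side() {
  pr -m -t -w 60 "$@"
}

# Two small sample files standing in for the output of two commands
tmpdir=$(mktemp -d)
printf 'hop1\nhop2\n' > "${tmpdir}/left.txt"
printf 'hopA\nhopB\n' > "${tmpdir}/right.txt"

side_by_side "${tmpdir}/left.txt" "${tmpdir}/right.txt"

rm -rf "${tmpdir}"
```

In Bash you can skip the temporary files and hand pr a process substitution such as <(some_command) in place of each file name, which is the form used in the examples that follow.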

Side-by-side traceroutes

Step 1: create hostlist.txt and populate it with IPs or hostnames of the servers you want to traceroute. Example:

# cat hostlist.txt
google.com
yahoo.com
yandex.ru
att.com

Step 2: declare some functions and variables

export GREP_COLOR='1;37;41'
awk_func() {
  sed 's/.*\*/* * */g' | awk '{print $2,$3}'
}
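To see what awk_func actually does, here is a quick sketch that feeds it a few made-up lines in the format that traceroute -n -q1 prints. The addresses are documentation-range placeholders, not real hops.

```shell
# Same helper as above, repeated so this snippet runs on its own
awk_func() {
  sed 's/.*\*/* * */g' | awk '{print $2,$3}'
}

# Sample input: hop number, IP, latency; a lone "*" marks a hop that timed out
printf ' 1  192.0.2.1  1.234 ms\n 2  *\n 3  198.51.100.7  212.500 ms\n' | awk_func
# Prints:
# 192.0.2.1 1.234
# * *
# 198.51.100.7 212.500
```

The sed step pads a timed-out hop to "* * *" so that awk still finds a second and third field, which keeps the columns aligned when pr merges several traceroutes.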

Step 3: parse the hostlist.txt and generate traceroute commands for each host:

for i in $(cat hostlist.txt); do
  echo "    <(traceroute -n -w1 -q1 ${i} | awk_func) \\"
done | sed '$s/ \\$/ | \\/'

Step 4: insert the line generated in Step 3 into the script below, as appropriate:

trace() {
  clear
  while [ true ]
  do
    pr -w 260 -m -t \
    <(traceroute -n -w1 -q1 google.com | awk_func) \
    <(traceroute -n -w1 -q1 yahoo.com | awk_func) \
    <(traceroute -n -w1 -q1 yandex.ru | awk_func) \
    <(traceroute -n -w1 -q1 att.com | awk_func) | \
    sed -r 's/to /@/g' | sed -r 's/(@)([A-Za-z0-9_.-]{1,})/ /g' | \
    grep -v "^$(date +'%Y')" | column -t | \
    egrep --color=auto '([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\s+[0-9]{3,}\.[0-9]{1,}|$'
    sleep 5 && clear
  done
}

Step 5: run the trace function and you should see the traceroutes printed side by side. Note that the egrep --color line in the trace function highlights devices showing triple-digit latency: \s+[0-9]{3,}, which you can modify as needed. The trailing |$ alternative matches every line, so nothing is filtered out and only the slow hops are colored. The traceroute loop will continue until you cancel it.
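The regex works on the text of the latency value, so every new threshold needs a new pattern. If you would rather set a numeric threshold, a small awk filter is one alternative. This is only a sketch: slow_hops is a name I made up, and it expects the two-column "IP latency" lines that awk_func produces.

```shell
# slow_hops THRESHOLD_MS: read "IP latency" pairs on stdin and print
# only the hops whose latency is at or above the threshold (default 100).
# Lines without a numeric latency, such as "* *", are skipped.
slow_hops() {
  awk -v limit="${1:-100}" \
    '$2 ~ /^[0-9.]+$/ && $2 + 0 >= limit + 0 { print $1, $2 }'
}

# Sample input with placeholder addresses
printf '192.0.2.1 1.234\n* *\n198.51.100.7 212.500\n' | slow_hops 100
# Prints:
# 198.51.100.7 212.500
```

Unlike the egrep approach, this drops the fast hops instead of highlighting the slow ones, so it is better suited to logging than to the live side-by-side view.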

So far we’ve been testing multiple remote hosts from the same local server. We may also need to test the same remote host from multiple local servers. Here’s a quick example that requires passwordless root SSH, which you can enable temporarily. Also note that I adjusted the egrep --color regex to highlight nodes with latency of 30ms or greater.

trace2() {
  clear
  while [ true ]
  do
    pr -w 260 -m -t \
    <(traceroute -n -w1 -q1 google.com | awk_func) \
    <(ssh -qtT root@ncc1701 "traceroute -n -w1 -q1 google.com" | awk_func) | \
    sed -r 's/to /@/g' | sed -r 's/(@)([A-Za-z0-9_.-]{1,})/ /g' | \
    grep -v "^$(date +'%Y')" | column -t | \
    egrep --color=auto '([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\s+([3-9][0-9]|[0-9]{3,})\.[0-9]{1,}|$'
    sleep 5 && clear
  done
}

And here’s a sample output:

More side-by-side examples

While I am on the subject of pr, here are a few more examples of how this command can be useful. In this example we’re comparing primary interface stats for the localhost and a remote server:

# The \$ escapes make the remote shell, not the local one,
# work out the name of the remote server's default interface
clear && while true ; do \
pr -w 260 -m -t \
<(/sbin/ifconfig $(/sbin/route | grep -m1 ^default | awk '{print $NF}') | grep X) \
<(ssh -qtT ncc1701 "/sbin/ifconfig \$(/sbin/route | grep -m1 ^default | awk '{print \$NF}') | grep X")
sleep 5 && clear; done

And the sample output:

Another option for viewing the output of multiple commands side by side is vimdiff (up to four side-by-side windows). Here’s an example comparing the kernel routing tables on two remote servers:

timeout 5 vimdiff <(ssh -qtT host1 "sudo su - root -c 'route -n'") <(ssh -qtT host2 "sudo su - root -c 'route -n'") 2>/dev/null ; reset
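One detail worth keeping in mind with ssh one-liners like the ones in this section: anything such as $(...) or $NF inside double quotes is expanded by the local shell before ssh ever sends the command, unless the dollar sign is escaped. The sketch below imitates this without a remote host by using sh -c as a stand-in for the remote shell. The variable name is arbitrary.

```shell
where="local"

# Unescaped: the calling shell substitutes $where before sh -c runs,
# so the inner shell is simply told to "echo local"
unescaped=$(sh -c "where=remote; echo $where")

# Escaped: the literal text $where reaches the inner shell,
# which expands it using its own value
escaped=$(sh -c "where=remote; echo \$where")

echo "unescaped: ${unescaped}"   # unescaped: local
echo "escaped:   ${escaped}"     # escaped:   remote
```

The same applies to ssh host "...": escape the dollar sign, or use single quotes around the whole command, whenever the value has to come from the remote side.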

Network bandwidth tests

So you want to know how fast your gigabit link really is in practical terms? There are a couple of simple tools that can help.

You can use the pv utility if you have SSH connectivity between the nodes. The syntax is very simple:

yes | pv | ssh <remote_host> "cat > /dev/null"
19.5MB 0:00:06 [3.76MB/s] [                            <=>
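Link speeds are normally quoted in bits per second, while pv reports bytes. Here is a small conversion sketch, assuming pv is using its default 1024-based units (so "MB/s" means 1,048,576 bytes per second). The function name is my own.

```shell
# mib_to_mbit RATE: convert a pv rate in 1024-based megabytes per second
# to decimal megabits per second, the unit used for link speeds
mib_to_mbit() {
  awk -v rate="$1" 'BEGIN { printf "%.2f\n", rate * 1048576 * 8 / 1000000 }'
}

mib_to_mbit 3.76
# Prints: 31.54
```

So the 3.76MB/s in the sample output above works out to roughly 31.5 Mbit/s, a small fraction of a gigabit link. Keep in mind that this test also measures SSH encryption overhead, not just the network.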

A more sophisticated test can be performed using iperf. With this tool you can use different ports, run a bi-directional transfer, specify multiple threads, test with UDP, and much more. Here’s a basic example:

# Tell server A to listen on port 2222 on the primary NIC
# for up to 10 concurrent connections with 128K receive buffer
iperf -s -l 128k -p 2222 -P 10

# Tell server B to establish four concurrent connections to
# server A with 128K send buffer for three minutes and to
# update progress information every 5 seconds
iperf -c <server_A_IP> -l 128k -P 4 -i 5 -t 180 -p 2222

The ability of iperf to initiate multiple transfer threads is very useful, as some network performance issues degrade speed on a per-connection basis. If the combined throughput of several threads is much higher than that of a single one, the limit applies to each connection; if it stays about the same, you are close to the maximum available bandwidth of the link.

A good example is the TCP window overflow problem, where the receiver, for whatever reason, runs out of receive buffer space for individual connections.
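One way to check for a per-connection limit is to repeat the client test with an increasing number of parallel streams and compare the totals. The helper below is a hypothetical sketch: iperf_sweep is my own name, the stream counts and the 60-second duration are arbitrary, and it only prints the commands so you can review them first.

```shell
# iperf_sweep SERVER [PORT]: print (not run) one iperf client command
# for each parallel stream count, using the same flags as the example above
iperf_sweep() {
  server="$1"
  port="${2:-2222}"
  for streams in 1 2 4 8; do
    echo "iperf -c ${server} -p ${port} -P ${streams} -i 5 -t 60"
  done
}

# 192.0.2.10 is a placeholder address for server A
iperf_sweep 192.0.2.10
```

Pipe the output to sh when you are ready to run the tests back to back, with the iperf server already listening on the other side.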

You may also want to transfer some real-world data using rsync and use that for your testing. In the following example we generate some sample data and run rsync over SSH:

# Create a temporary file for the source text
f=$(mktemp)

# Populate it with a large volume of text
curl -s0 -k https://norvig.com/big.txt > ${f}

# Generate ten folders with a hundred 4MB test files in each
# for a total of 4GB of data
d=/opt/speedtest
mkdir -p ${d} && cd ${d}
for i in $(seq -w 01 10); do
  mkdir -p dir_${i}; echo "Populating dir_${i}"
  for j in $(seq -w 001 100); do
    head -c 4096KB <(shuf -n 100000 ${f}) > ./dir_${i}/file_${j}
  done
done

# Run rsync
time rsync -av ${d}/ ncc1711:${d}/

If you’re looking to just test your Internet connection link, check out this article. And here I have a fairly complete listing of various CLI system monitoring tools (including network tools).

Running tcpdump

While your bandwidth test is running and you’re waiting, it would make sense to capture some packets between source and target servers for further analysis. Ideally, you want to capture traffic from both sides of the data transfer simultaneously and limit captured data to the two IPs and the specific port.

t="/opt/tmp"
mkdir -p ${t}
target_ip="192.168.122.117"
target_port="2222"
nic="eth0"

timeout 180 tcpdump -i ${nic} host ${target_ip} and \
port ${target_port} -n -s 0 -vvv \
-w ${t}/${target_ip}_${target_port}_$(date +'%Y-%m-%d_%H%M%S').pcap

Of course, the tcpdump command has its own options for limiting a capture, but I found they don’t always work as expected, and wrapping tcpdump in the timeout command is just more reliable.
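The command above filters on the target IP and port only. If the capture host sees traffic from other sources to the same target, you may want to limit the capture to the two endpoints, as suggested earlier. Here is a minimal sketch of building such a filter expression. The pcap_filter function and the addresses are placeholders of mine.

```shell
# pcap_filter SRC_IP DST_IP PORT: build a capture filter that matches
# only traffic between the two hosts on the given port
pcap_filter() {
  printf 'host %s and host %s and port %s\n' "$1" "$2" "$3"
}

pcap_filter 192.168.122.10 192.168.122.117 2222
# Prints: host 192.168.122.10 and host 192.168.122.117 and port 2222
```

The resulting string can be passed to tcpdump in place of the host and port expression, for example: timeout 180 tcpdump -i eth0 -n -s 0 -w capture.pcap $(pcap_filter 192.168.122.10 192.168.122.117 2222). Running the same filter on both servers gives you two captures that line up for comparison.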