| NFS I/O Stats with LoggingKrazyWorks

Networking

Unix and Linux network configuration. Multiple network interfaces. Bridged NICs. High-availability network configurations.

Applications

Reviews of latest Unix and Linux software. Helpful tips for application support admins. Automating application support.

Data

Disk partitioning, filesystems, directories, and files. Volume management, logical volumes, HA filesystems. Backups and disaster recovery.

Monitoring

Distributed server monitoring. Server performance and capacity planning. Monitoring applications, network status and user activity.

Commands & Shells

Cool Unix shell commands and options. Command-line tools and application. Things every Unix sysadmin needs to know.

Home » Commands & Shells, Featured, Networking, Performance

NFS I/O Stats with Logging

Submitted by Igor on January 3, 2022 – 1:43 pm

The nfsiostat is an excellent tool for analyzing NFS performance. The only major problem with this utility is that it’s designed to be used interactively. It does not have logging or timestamping capabilities. I needed to get information about NFS mount performance during a nightly backup, and I didn’t really feel like sitting in front of the computer at three in the morning, and so I put together a small wrapper script for nfsiostat.

The output produced by nfsiostat is spread over multiple lines and is not easily greppable. So what I needed to do is merge the output into a single line for each update, remove any extraneous characters and spaces, prepend a timestamp, and write the output to a log file.

The difficulty here was the buffered output generated by nfsiostatthat also includes some Windows-like control characters (^M, for example – no idea why). So I had to rely on the unbuffer utility to pipe the output of nfsiostat to multiple other commands that would massage the text into the desired format.

The script is below, and you can also download it from my GitHub repo. There are a few basic prerequisites listed in the script.

nfsiostat_wrapper.sh

#!/bin/bash
#
#                                      |
#                                  ___/"\___
#                          __________/ o \__________
#                            (I) (G) \___/ (O) (R)
#                                Igor Oseledko
#                           igor@comradegeneral.com
#                                 2021-08-23
# ----------------------------------------------------------------------------
# A wrapper script for the nfsiostat command that will generate timestamped
# CSV output.
# Prerequisites: timeout, dos2unix, unbuffer, nfsiostat
# ----------------------------------------------------------------------------
# Change Log:
# ****************************************************************************
# 2021-08-23	igor	Wrote this script
# ****************************************************************************
interval=""
count=""
fs=""

configure() {
  this_script=$(basename "$(test -L "$0" && readlink "$0" || echo "$0")")
  this_script_full="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)/$(basename "${BASH_SOURCE[0]}")"
  this_host=$(hostname | awk -F'.' '{print $1}')
  this_time_db=$(date +'%Y-%m-%d %H:%M:%S')
  this_time=$(date -d "${this_time_db}" +'%Y-%m-%d_%H%M%S')
  outdir="/var/tmp/nfsiostat"
  mkdir -p "${outdir}"
  outfile="${outdir}/nfsiostat_${this_host}_${this_time}.csv"
  tmpfile=$(mktemp)
  if [ -z "${interval}" ] || [ -z "${count}" ] || [ -z "${fs}" ]; then exit 21; fi
  mountpoint "${fs}" 2>/dev/null 1>&2 || exit 21
  NFSIOSTAT=$(which nfsiostat 2>/dev/null) || exit 23
  TIMEOUT=$(which timeout 2>/dev/null) || exit 25
  UNBUFFER=$(which unbuffer 2>/dev/null) || exit 25
  DOS2UNIX=$(which dos2unix 2>/dev/null) || exit 27
}

do_nfsiostat() {
  (( timer = ( interval * count ) + 1 ))
  ${TIMEOUT} ${timer} ${NFSIOSTAT} ${interval} ${count} "${fs}" 2>/dev/null | ${UNBUFFER} -p grep --line-buffer -v o | \
  ${UNBUFFER} -p ts '%Y-%m-%d_%H:%M:%S' | grep --line-buffer '\.' >> "${outfile}"
  ${DOS2UNIX} "${outfile}"
  sed -i 'N;N;s/\n/ /g' "${outfile}"
  cat "${outfile}" | while read line; do
    t="$(echo "${line}" | awk '{print $1}')"
    sed -i "s/${t}//2g" "${outfile}"
  done
  sed -ri 's/\s{1,}/,/g' "${outfile}"
  sed -i 's/_/ /g' "${outfile}"
  tail -n +2 "${outfile}" > "${tmpfile}"
  cat "${tmpfile}" | (echo "timestamp,op_s,rpc_bklog,r_ops_s,r_kB_s,r_kB_op,r_retrans,r_retrans_pct,r_avg_RTT_ms,r_avg_exe_ms,w_ops_s,w_kB_s,w_kB_op,w_retrans,w_retrans_pct,w_avg_RTT_ms,w_avg_exe_ms" && cat) > "${outfile}"
  echo "${outfile}"
}

# ----------------------------------------------------------------------------
# RUNTIME
# \(^_^)/                                      __|__
#                                     __|__ *---o0o---*
#                            __|__ *---o0o---*
#                         *---o0o---*
# ----------------------------------------------------------------------------
configure
do_nfsiostat

The wrapper script takes three arguments: refresh interval in seconds, refresh count, and NFS mountpoint (only one is supported). For example, if your NFS mount is /downloads and you want to gather NFS performance stats every 5 seconds for an hour, then the command will be:

./nfsiostat_wrapper.sh 5 720 "/downloads"

# I suggest you run it using nohup like so:
nohup ./nfsiostat_wrapper.sh 5 720 "/downloads" </dev/null >/dev/null 2>&1 &

# You can also schedule it via atd. When scheduling via atd or cron, make sure to
# provide the absolute path to the wrapper script.
at "3:10am tomorrow" <<<"/var/adm/bin/nfsiostat_wrapper.sh 5 720 /downloads"

The script will place the resulting CSV file in /var/tmp/nfsiostat folder. The CSV header will contain all the metrics displayed by nfsiostat. Here’s an example:

Sample output

timestamp,op_s,rpc_bklog,r_ops_s,r_kB_s,r_kB_op,r_retrans,r_retrans_pct,r_avg_RTT_ms,r_avg_exe_ms,w_ops_s,w_kB_s,w_kB_op,w_retrans,w_retrans_pct,w_avg_RTT_ms,w_avg_exe_ms
2021-08-24 05:47:49,1755.00,0.00,1742.200,111943.155,64.254,0,(0.0%),45.016,45.283,0.000,0.000,0.000,0,(0.0%),0.000,0.000
2021-08-24 05:47:54,1449.00,0.00,1450.200,93181.015,64.254,0,(0.0%),54.195,54.515,0.000,0.000,0.000,0,(0.0%),0.000,0.000
2021-08-24 05:47:59,678.00,0.00,681.200,43769.761,64.254,0,(0.0%),54.361,54.736,0.000,0.000,0.000,0,(0.0%),0.000,0.000
2021-08-24 05:48:04,1677.00,0.00,1672.800,107483.934,64.254,0,(0.0%),46.752,46.967,0.000,0.000,0.000,0,(0.0%),0.000,0.000
2021-08-24 05:48:09,1479.00,0.00,1479.800,95082.930,64.254,0,(0.0%),47.656,47.907,0.000,0.000,0.000,0,(0.0%),0.000,0.000
2021-08-24 05:48:14,1680.00,0.00,1678.400,107843.756,64.254,0,(0.0%),41.711,41.976,0.000,0.000,0.000,0,(0.0%),0.000,0.000
2021-08-24 05:48:19,1512.00,0.00,1515.400,97370.370,64.254,0,(0.0%),49.505,49.753,0.000,0.000,0.000,0,(0.0%),0.000,0.000
2021-08-24 05:48:24,522.00,0.00,520.600,33450.584,64.254,0,(0.0%),65.567,65.957,0.000,0.000,0.000,0,(0.0%),0.000,0.000
2021-08-24 05:48:29,624.00,0.00,623.200,40043.034,64.254,0,(0.0%),82.199,82.543,0.000,0.000,0.000,0,(0.0%),0.000,0.000

I was pretty proud of myself for getting this to work, but looking at this mess of data, I knew I would need to make a timeline chart. Here’s the list of all the fields in the CSV file:

1   timestamp
2   op_s
3   rpc_bklog
4   r_ops_s
5   r_kB_s
6   r_kB_op
7   r_retrans
8   r_retrans_pct
9   r_avg_RTT_ms
10  r_avg_exe_ms
11  w_ops_s
12  w_kB_s
13  w_kB_op
14  w_retrans
15  w_retrans_pct
16  w_avg_RTT_ms
17  w_avg_exe_ms

I will plot field #4 (read operations per second) and field #9 (read average RTT) in the example below. You can combine multiple metrics, as long as it makes sense to consider them together and you’re not trying to compare apples to oranges.

gnuplot -e "set title 'nfsiostat ncc1711 2021-08-24 05:49 /downloads'; 
set key;
set datafile separator ',';
set offset 1,1,1,1; 
set autoscale xy; 
set mxtics; set mytics; 
set style line 12 lc rgb '#ddccdd' lt 1 lw 1.5; set style line 13 lc rgb '#ddccdd' lt 1 lw 0.5; 
set grid xtics mxtics ytics mytics back ls 12, ls 13; 
set terminal png size 3840,2160 enhanced font '/usr/share/fonts/truetype/liberation/LiberationSans-Regular.ttf,22';
set output '/mnt/nas04/downloads/${logfile}.png'; 
set style data labels;
set timefmt '%Y-%m-%d %H:%M:%S';
set xdata time; 
set xlabel 'Date' font '/usr/share/fonts/liberation/LiberationSans-Regular.ttf,20'; 
set ylabel 'Metrics' font '/usr/share/fonts/truetype/liberation/LiberationSans-Regular.ttf,20'; 
plot \
'${logfile}' using 1:4 title '$(head -1 ${logfile} | awk -F, '{print $4}' | sed 's/_/ /g')' with histeps lw 3, \
'${logfile}' using 1:9 title '$(head -1 ${logfile} | awk -F, '{print $9}' | sed 's/_/ /g')' with histeps lw 3"

Sample output

When troubleshooting NFS performance issues, the critical fields to watch are r/w retransmit (fields 7 & 14) and r/w avg RTT (fields 9 & 16). A substantial increase in these four parameters may indicate a performance bottleneck resulting from network misconfiguration or an issue with the NFS client or your NAS.