Brief Guide to PDSh
PDSh (Parallel Distributed Shell) is a high-performance parallel remote shell utility that lets you execute commands on multiple remote hosts simultaneously. The utility was originally developed by LLNL and is currently available under the GNU GPLv2. Below are some quick instructions and a few basic examples to get you started.
Configure key-based SSH authentication for all nodes. Verify passwordless SSH access.
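One way to verify passwordless access is sketched below. The function name check_passwordless_ssh and the SSH_BIN override are hypothetical names made up for this illustration; the ssh options BatchMode and ConnectTimeout are standard OpenSSH settings. With BatchMode=yes, ssh fails instead of prompting for a password, so a failure here usually means the key is not installed on that host.

```shell
# Sketch only: report whether key-based login works for each host given.
# SSH_BIN is an assumption of this example; it lets you swap in another
# binary for a dry run. By default the real ssh client is used.
SSH_BIN="${SSH_BIN:-ssh}"

check_passwordless_ssh() {
    rc=0
    for host in "$@"; do
        # BatchMode=yes: never prompt; ConnectTimeout=5: give up after 5 seconds
        if "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            rc=1
        fi
    done
    return "$rc"
}
```

Typical use, assuming a key pair already exists and has been copied with ssh-copy-id: check_passwordless_ssh host1.domain host2.domain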
Add the following to your ~/.bashrc
alias ssh='/usr/bin/ssh -o StrictHostKeyChecking=no'
export PDSH_SSH_ARGS_APPEND="-q -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o PreferredAuthentications=publickey"
Common options:
-b (batch mode)
-t (connection timeout in seconds; default is 10 seconds)
-u (limit remote command execution time, in seconds)
-w (list of hosts)
-N (suppress the hostname prefix in the pdsh output)
Host list file format:
host1.domain,host2.domain,host3.domain
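If your inventory is kept one host per line, a small filter can produce the comma-separated form shown above. This is only a sketch: hosts_to_csv is a hypothetical helper name, and it assumes blank lines and lines starting with # should be skipped.

```shell
# Sketch only: join one-host-per-line input into a single comma-separated line.
hosts_to_csv() {
    # Drop blank lines and comment lines, then join what remains with commas.
    grep -Ev '^[[:space:]]*(#|$)' | paste -s -d, -
}
```

For example, hosts_to_csv < hosts.txt > server_list_pdsh.txt, where hosts.txt is a placeholder for your own file.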
Example 1:
Restart snmpd process on all nodes in the list
pdsh -b -t 10 -u 15 -w ^/home/user/server_list_pdsh.txt "sudo su - root -c '/sbin/service snmpd restart'" >/dev/null 2>&1
NOTE: provide the full path to the host list file and prefix it with ^
Example 2:
Launch a background process on node17-node32
pdsh -w node[17-32] "stress --cpu 10 --io 1 --vm 1 --vm-bytes 128M --timeout 30 --verbose &disown"
Example 3:
Run the "uptime" command on servers node00-node19 (the pattern below also matches node00)
pdsh -b -t 10 -u 15 -w node[0-1][0-9] "sudo su - root -c 'uptime'" 2>/dev/null
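To see which names a two-bracket pattern such as node[0-1][0-9] covers before running anything, the same set can be listed locally. The sketch below is an approximation written with plain shell loops, not pdsh's own hostlist parser, and expand_two_digit is a hypothetical helper name.

```shell
# Sketch only: print every name matched by PREFIX[0-1][0-9].
expand_two_digit() {
    prefix=$1
    for tens in 0 1; do
        for ones in 0 1 2 3 4 5 6 7 8 9; do
            printf '%s%s%s\n' "$prefix" "$tens" "$ones"
        done
    done
}

expand_two_digit node | head -n 3
# prints:
# node00
# node01
# node02
```

The list has 20 entries and starts at node00, so a host named node00 must exist or will be reported as unreachable.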
Example 4:
Check for the oraagent.bin process on oracle00a-oracle19a
pdsh -b -t 10 -u 15 -w oracle[0-1][0-9]a "sudo su - oracle -c 'ps -ef | grep [o]raagent\.bin'" 2>/dev/null
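The [o] in the grep pattern keeps grep from matching its own entry in the process list: the regex matches the text "oraagent.bin", but grep's command line contains the literal "[o]raagent\.bin", which does not. The sketch below shows the effect on canned input; the two sample ps lines, including the /path/to/ prefix, are invented for illustration.

```shell
# Sketch only: two made-up "ps -ef" lines, one for the agent and one for grep itself.
ps_sample='oracle 4242 1 0 10:00 ? 00:00:01 /path/to/oraagent.bin
oracle 4300 4299 0 10:01 pts/0 00:00:00 grep [o]raagent\.bin'

# Only the first line matches; the grep line is filtered out by the bracket trick.
printf '%s\n' "$ps_sample" | grep '[o]raagent\.bin'
```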