Salt Snippets
SaltStack has decent documentation (at least compared to Puppet), but it’s short on examples. There’s plenty of simple stuff that is useful if you’re managing a handful of nodes, but why would you need Salt if you’re managing just a handful of nodes? Here’s a small collection of random Salt commands, mostly geared toward remote command execution.
For the following examples I am using ten test nodes named node01 through node10. The example below shows how to run arbitrary shell commands. The output is set to “txt”, which is easier to parse than the default goofy Salt format with indentation and pretty colors. Additionally, the “timeout” option should be used at all times unless you’re a Martian.
salt --timeout=5 --output=txt 'node*' cmd.run "uname -a ; uptime"
Sample output:
... node05.domain.local: Linux node05.domain.local 2.6.32-431.5.1.el6.x86_64 #1 SMP Fri Jan 10 14:46:43 EST 2014 x86_64 x86_64 x86_64 GNU/Linux ...
As you can see, Salt prefixes each output line with the hostname. If that annoys you, you can strip it out:
salt --timeout=5 --output=txt 'node*' cmd.run "uname -a ; uptime" | tr -s ' ' | cut -d ' ' -f2-
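The tr/cut approach above squeezes repeated spaces in the command output as a side effect. Here is a minimal alternative sketch that only removes the leading "hostname: " prefix. The function name strip_host and the sample lines are made up for illustration, and it assumes minion names contain no colons.

```shell
#!/bin/sh
# strip_host (hypothetical helper): drop everything up to and including
# the first ": " on each line, leaving the rest of the line untouched.
strip_host() {
    sed -e 's/^[^:]*: //'
}

# Sample lines imitating --output=txt (hostnames are invented).
printf '%s\n' \
    'node01.domain.local: Linux node01.domain.local 2.6.32 x86_64 GNU/Linux' \
    'node02.domain.local: Linux node02.domain.local 2.6.32 x86_64 GNU/Linux' \
    | strip_host
```

In real use you would pipe the salt command into strip_host instead of the printf.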
You may want to target your minions in a more precise manner. Perhaps something along these lines:
salt --timeout=5 --output=txt -E 'node[0-9][0-9].wil*' cmd.run "uptime"
salt --timeout=5 --output=txt -E 'node(0?[0-9]{1,2}).wil*' cmd.run "uptime"
The “-E” option causes the target expression to be interpreted as a PCRE regular expression rather than a shell glob.
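Before pointing a regular expression at real minions, it can help to try it against a list of names locally. The sketch below uses grep -E, which is only an approximation of the regex engine Salt uses, and it assumes the match is anchored at the start of the minion name. The helper name preview_targets and the hostnames are invented for the example.

```shell
#!/bin/sh
# preview_targets (hypothetical helper): print the names from stdin that
# match the given expression at the start of the name.
# Assumption: grep -E behaves closely enough to Salt's matching for
# simple patterns like the ones in this post.
preview_targets() {
    grep -E "^(${1})"
}

# Try both expressions from above against a few invented names.
for pattern in 'node[0-9][0-9].wil*' 'node(0?[0-9]{1,2}).wil*' ; do
    echo "pattern: ${pattern}"
    printf '%s\n' node01.wil.example node7.wil.example db01.wil.example \
        | preview_targets "${pattern}"
done
```

With these names, the first expression should pick up only node01, while the second should also pick up node7.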
You can concatenate multiple shell commands, but there is a limit to this method. Eventually you will reach a level of complexity where your single quotes walk all over your double quotes. The solution is to use the “cmd.script” function.
Create the /srv/salt/scripts folder. Write a script, make it executable, and place it in /srv/salt/scripts. Run it like so:
salt --timeout=5 --output=txt -E 'node[0-9][0-9].wil*' cmd.script salt://scripts/script.sh
Sample output:
... node02.domain.local: {'pid': 1850, 'retcode': 0, 'stderr': '', 'stdout': ' 21:56:01 up 69 days, 11:13, 1 user, load average: 3.49, 3.38, 3.17'} ...
Once again, you may want to parse the output to cut the extraneous bits, unless you like the extraneous bits:
salt --timeout=5 --output=txt -E 'node[0-9][0-9].wil*' cmd.script salt://scripts/script.sh | sed -e "s/'stdout': ' /@/g" -e "s/'}$//g" | cut -d '@' -f2-
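The same extraction can be done in a single sed expression. This is a rough sketch, assuming the line looks like the sample output above, with 'stdout' as the last key. The function name extract_stdout is made up, and any leading space inside the stdout value is kept.

```shell
#!/bin/sh
# extract_stdout (hypothetical helper): print only the value of the
# 'stdout' key from each dictionary-style line of cmd.script output.
# Assumption: 'stdout' is the last key and the line ends with '}.
extract_stdout() {
    sed -n -E "s/^.*'stdout': '(.*)'\}[[:space:]]*\$/\1/p"
}

# Sample line modeled on the output shown above.
echo "node02.domain.local: {'pid': 1850, 'retcode': 0, 'stderr': '', 'stdout': 'up 69 days, 11:13'}" \
    | extract_stdout
```

Lines that do not match the expected shape are dropped rather than passed through, which makes odd output easier to notice.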
Here’s a fun little example that will show you a user’s login activity across multiple systems. First, create a script in your salt-master script directory and call it /srv/salt/scripts/last_track.sh
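#!/bin/bash
# Print the most recent login time of user $1 as YYYY-MM-DD HH:MM:SS.
if [ -n "$1" ] ; then u="$1" ; else exit 1 ; fi
# "last" prints no year, so extract only the "Mon DD HH:MM" part.
t=$(last -n 1 "${u}" | grep "^${u}" | grep -Eo "[JFANDOMS][a-z]{2}\s?[ 0-9]{2}\s[0-9]{2}:[0-9]{2}")
if [ -z "${t}" ] ; then exit 0 ; fi
d=$(date -d "${t}" +%s)
# A timestamp in the future means the login happened last year.
if [ "${d}" -gt "$(date +%s)" ] ; then (( d = d - 31536000 )) ; fi
date -d "@${d}" +'%Y-%m-%d %H:%M:%S'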
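The least obvious part of the script is the year adjustment: “last” prints no year, so “date” assumes the current one, and a December login parsed in January lands in the future. Here is that step in isolation as a small sketch. The function name rollback_if_future is made up, and like the script it treats a year as 365 days.

```shell
#!/bin/sh
# rollback_if_future (hypothetical helper): given a parsed timestamp and
# the current time (both epoch seconds), move the timestamp back by
# 365 days if it lies in the future.
rollback_if_future() {
    ts=$1
    now=$2
    if [ "${ts}" -gt "${now}" ] ; then
        ts=$(( ts - 31536000 ))   # 365 * 24 * 3600 seconds
    fi
    echo "${ts}"
}

# A timestamp later than "now" gets pushed back one year.
rollback_if_future 1700000000 1690000000
# A timestamp in the past is left alone.
rollback_if_future 1690000000 1700000000
```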
Now run the script via salt and make sure to provide the <username> variable:
salt --timeout=10 --output=txt "dl*" cmd.script salt://scripts/last_track.sh <username> | sed -e "s/: {'/@/g" -e "s/': '/@/g" -e "s/'}/@/g" | awk -F'@' '{print $1","$(NF-1)}' | grep -E "[0-9]{4}\-[0-9]{2}" | sort -r -t, -k2 | column -s, -t
The “grains” feature is useful for a more diverse server environment, where you may want to select subsets of machines by CPU architecture, RAM allocation, hardware manufacturer, etc. Here are a few examples:
salt --timeout=5 --output=txt -G 'virtual:physical' cmd.run "uname -a"
salt --timeout=5 --output=txt -G 'manufacturer:HP' cmd.run "uname -a"
salt --timeout=5 --output=txt -G 'cpuarch:x86_64' cmd.run "uname -a"
salt --timeout=5 --output=txt -G 'os:RedHat' cmd.run "uptime"
To see a list of all available “grains”, run “salt '*' grains.items”
The command below will display your RHEL5 servers with the highest swap utilization:
salt --timeout=5 --output=txt -G "osfinger:Red*5" cmd.run "free -k | grep ^Swap:" | awk '{print $1"\t"$4}' | grep -vE "[[:space:]]0$" | sort -k2 -rn | column -t
Here’s a command to show your most favorite servers:
salt --timeout=5 --output=txt "*" cmd.run "last your_username | grep -c ^your_username" | sort -k2 -rn | head -10 | column -t
To see memory allocated to all of your Tomcat servers:
salt --timeout=5 --output=txt "*tomcat*" cmd.run "free -k | grep ^Mem:" | cut -d ' ' -f3- | awk '{ SUM += $1} END { print ( SUM/1024/1024 )" GB" }'
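The cut step can be folded into awk by addressing the column directly. The sketch below assumes each line has the shape "hostname: Mem: total used free ..." and that minion names contain no spaces. The function name sum_mem_gb and the sample figures are invented.

```shell
#!/bin/sh
# sum_mem_gb (hypothetical helper): add up the "total" column (field 3,
# in kilobytes) of every "Mem:" line and print the sum in GB.
sum_mem_gb() {
    awk '$2 == "Mem:" { sum += $3 } END { printf "%.2f GB\n", sum / 1024 / 1024 }'
}

# Two invented servers with 8 GB and 4 GB of RAM.
printf '%s\n' \
    'tomcat01: Mem:  8388608 4000000 4388608' \
    'tomcat02: Mem:  4194304 1000000 3194304' \
    | sum_mem_gb
```

Checking that field 2 is "Mem:" keeps stray lines, such as error messages from an unhappy minion, out of the total.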
And here’s a line to show you the total size of all LVM volumes on your WebLogic servers:
salt --timeout=5 --output=txt "*weblogic*" cmd.run "vgs --units=k" | cut -d ' ' -f3- | awk '{print $6}' | grep -oE "[0-9]{1,100}\.[0-9]{2}" | awk '{ SUM += $1} END { print ( SUM/1024/1024 )" GB" }'
This command will show you the total number of CPU cores on your i686 systems:
salt --timeout=5 --output=txt -G "osarch:i686" cmd.run "cat /proc/cpuinfo" | cut -d ' ' -f2- | grep -c ^processor | awk '{ SUM += $1} END { print ( SUM )" cores" }'
This command will show any servers with local filesystem utilization at or above 90%:
salt --output=txt --timeout=5 "*" cmd.run "df -hPl | grep -E '9[0-9]%|100%'" | grep % | column -t
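A regular expression works for a fixed 90% cutoff, but a numeric comparison lets you pick any threshold. Here is a sketch that filters on the master side instead. It assumes “df -P” style lines prefixed with the hostname, so the usage percentage is field 6, and mount points without spaces. The function name over_threshold and the sample data are made up.

```shell
#!/bin/sh
# over_threshold (hypothetical helper): print df lines whose Use% value
# is at or above the given limit (default 90).
# Assumption: field 6 is the Use% column, as in "host: fs size used avail use% mount".
over_threshold() {
    awk -v limit="${1:-90}" '{
        pct = $6
        sub(/%$/, "", pct)                      # strip the trailing percent sign
        if (pct ~ /^[0-9]+$/ && pct + 0 >= limit) print
    }'
}

# Invented sample data: only the 92% line passes a limit of 90.
printf '%s\n' \
    'web01: /dev/sda1 50G 46G 4.0G 92% /' \
    'web01: /dev/sda2 20G 1.8G 18G 9% /var' \
    'web02: /dev/sda1 50G 43G 7.0G 86% /' \
    | over_threshold 90
```

In real use, the remote command would be a plain “df -hPl” with the salt output piped into over_threshold.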
Identify servers with 15-min load average in double-digits:
salt --timeout=5 --output=txt "*" cmd.run "uptime | grep -E '[0-9]{2}\.[0-9]{2}$'"
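The same idea works for load averages: compare the last field numerically rather than counting digits. This is a sketch assuming the 15-minute load average is the last field of the uptime line and uses a dot as the decimal separator. The function name high_load and the sample lines are invented.

```shell
#!/bin/sh
# high_load (hypothetical helper): print uptime lines whose last field
# (the 15-minute load average) is at or above the limit (default 10).
high_load() {
    awk -v limit="${1:-10}" '$NF + 0 >= limit'
}

# Invented sample data: only node02 has a 15-minute load of 10 or more.
printf '%s\n' \
    'node01:  21:56:01 up 69 days, 11:13,  1 user,  load average: 3.49, 3.38, 3.17' \
    'node02:  21:56:02 up 12 days,  4:02,  2 users,  load average: 15.10, 14.02, 13.17' \
    | high_load 10
```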
Let’s look at some Salt maintenance tasks. From time to time the salt-minion service dies. I don’t know why or how to fix it, but here’s how to check the status of the salt-minion service and restart it. This example uses PDSH. To actually restart the service, just replace “status” with “restart”:
/usr/bin/pdsh -b -N -t 10 -u 15 -w `salt-run -t 20 manage.down 2>/dev/null | awk '{printf $1","}' | sed 's/,$//g'` "sudo su - root -c '/sbin/service salt-minion status'"
If you don’t have PDSH, here’s the SSH version of the above command:
for i in `salt-run -t 20 manage.down 2>/dev/null | awk '{print $1}'` ; do timeout 5 ssh -qt ${i} "sudo su - root -c '/sbin/service salt-minion status'" ; done
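The comma-separated host list that PDSH expects can also be built with paste, which avoids using the hostname as an awk printf format string. Here is a small sketch, assuming the list of down minions arrives as one name per line in the first column. The function name join_hosts and the hostnames are made up.

```shell
#!/bin/sh
# join_hosts (hypothetical helper): take the first column of each
# non-empty line and join the values with commas.
join_hosts() {
    awk 'NF { print $1 }' | paste -sd, -
}

# Invented sample standing in for the list of unresponsive minions.
printf '%s\n' 'node03.domain.local' 'node07.domain.local' | join_hosts
```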
There you go. Have fun.