Monthly Archives: February 2016

Shut down your hardware on a temperature threshold

Faulty temperature control in a datacenter can cause serious damage. If you can't influence the reliability of the air conditioning, you have to set up some protection on your end, that is, on your server.

Most modern hard drives provide temperature sensors that can be used to trigger actions when a threshold is reached. I'll describe how to do this with a tool called `hddtemp` and `cron` on Ubuntu 16.04.

We need to install the tool's package:

apt install hddtemp

Now we create a small shell script that `cron` will later run on a regular basis. In this scenario we save the script as `/usr/local/bin/harddrive-watcher.sh`:

#!/bin/bash
# Check drive temperatures; mail a warning or shut the system down.
HOSTNAME=yourHostname
MAILTARGET=root
HDDS="/dev/sda /dev/sdb"
HDT=/usr/sbin/hddtemp
LOG=/usr/bin/logger
DOWN=/sbin/shutdown
ALERT_LEVEL_MAIL=35
ALERT_LEVEL_SHUTDOWN=50
for disk in $HDDS
do
    if [ -b "$disk" ]; then
        # -n prints only the number, so the result does not depend on
        # how many words the drive's model name contains
        HDDTEMP=$($HDT -n "$disk" 2>/dev/null)
        # Skip drives that return no numeric reading (e.g. no sensor)
        case "$HDDTEMP" in
            ''|*[!0-9]*) continue ;;
        esac
        if [ "$HDDTEMP" -ge "$ALERT_LEVEL_MAIL" ]; then
            MSG="Warning: hard disk $disk is at $HDDTEMP°C (warning limit: $ALERT_LEVEL_MAIL°C)"
            $LOG "$MSG"
            echo "$MSG" | mail -s "$HOSTNAME" "$MAILTARGET"
        fi
        if [ "$HDDTEMP" -ge "$ALERT_LEVEL_SHUTDOWN" ]; then
            $LOG "Emergency shutdown: system going down as hard disk $disk is at $HDDTEMP°C (final limit: $ALERT_LEVEL_SHUTDOWN°C)"
            sync
            $DOWN -h now
        fi
    fi
done
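
When parsing `hddtemp`'s default output, keep in mind that the line has the form `device: model: temperature` and that model names may contain spaces, so the temperature does not always sit at the same field position. The following is a minimal sketch of a more defensive parser under that assumption. The function name `parse_temp` and the sample lines are made up for illustration; check them against what `hddtemp` prints on your own hardware.

```shell
# parse_temp: hypothetical helper, not part of hddtemp.
# Takes one line of (assumed) hddtemp output and prints the integer
# temperature. Prints nothing and returns non-zero if the last field
# does not start with digits.
parse_temp() {
    local line=$1 last
    last=${line##*: }      # keep only the text after the final ": "
    last=${last%%[!0-9]*}  # cut everything from the first non-digit on
    [ -n "$last" ] && printf '%s\n' "$last"
}

# Sample lines (illustrative, not captured from a real drive)
parse_temp '/dev/sda: WDC WD10EZEX-00BN5A0: 34°C'   # two-word model name
parse_temp '/dev/sdb: ST3500418AS: 41°C'            # one-word model name
parse_temp '/dev/sdc: SomeModel: drive is sleeping' || echo "no reading"
```

Because the function takes the text after the last colon instead of a fixed field number, the length of the model name does not matter, and a non-numeric message yields no value instead of breaking the numeric comparison.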

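Lastly, we make the script executable (`chmod +x /usr/local/bin/harddrive-watcher.sh`) and add the following line to `root`'s crontab (`crontab -e` as `root`):

*/5 * * * * /usr/local/bin/harddrive-watcher.sh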

Now the temperature is checked every 5 minutes against two thresholds: when the first one is reached, you'll receive a mail (provided a working `mail` command is set up on the host). At the second threshold, the system shuts down to protect its data from thermal damage.
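
Since you can't easily heat a drive up to test this, it may help to exercise the threshold logic on its own. Below is a rough sketch that assumes the same two limits as above (35 and 50). The helper `classify_temp` is hypothetical and reports only the highest level reached, whereas the real script both mails and shuts down at the second threshold.

```shell
# classify_temp: hypothetical helper mirroring the two-threshold check.
# Usage: classify_temp <reading> <mail_level> <shutdown_level>
# Prints one of: ok, warn, shutdown, invalid
classify_temp() {
    local temp=$1 mail_level=$2 shutdown_level=$3
    # Reject empty or non-numeric readings before comparing
    case "$temp" in
        ''|*[!0-9]*) echo invalid; return 1 ;;
    esac
    if [ "$temp" -ge "$shutdown_level" ]; then
        echo shutdown
    elif [ "$temp" -ge "$mail_level" ]; then
        echo warn
    else
        echo ok
    fi
}

# Dry run with made-up readings; nothing is mailed or shut down
for reading in 30 35 49 50 "drive is sleeping"; do
    printf '%s -> %s\n' "$reading" "$(classify_temp "$reading" 35 50)"
done
```

Both comparisons use `-ge`, so a reading exactly at a limit already triggers that level.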