Shut down your hardware at a temperature threshold

Faulty temperature control in a datacenter can cause serious damage. If you can't influence the reliability of the air conditioning, you have to set up some protection on your end, that is, on your server.

Most modern hard drives provide temperature sensors which can be used to trigger actions when a threshold is reached. I'll describe how to do this with a tool called hddtemp and cron in Ubuntu 16.04.

We need to install the tool's package:

apt install hddtemp
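
Before building on the tool, it is worth checking what it reports for your drives. The sketch below assumes hddtemp prints one line per drive in the form "/dev/sda: MODEL: 35°C"; the helper name parse_hddtemp_line is made up for illustration. It extracts the number from such a line and prints nothing when the line carries no temperature (for example when the drive has no sensor).

```shell
#!/bin/bash
# Hypothetical helper: extract the numeric temperature from one line of
# hddtemp output, assumed to look like "/dev/sda: MODEL NAME: 35°C".
parse_hddtemp_line() {
  local line="$1"
  # Keep only the text after the last ": " separator, e.g. "35°C".
  local last="${line##*: }"
  # Cut everything from the first non-digit character onwards.
  local digits="${last%%[!0-9]*}"
  # Print the number, or fail silently if there was none.
  [ -n "$digits" ] && echo "$digits"
}
```

On a real machine you would feed it live output, e.g. parse_hddtemp_line "$(hddtemp /dev/sda)", run as root.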

Now we create a little shell script which cron will later run on a regular basis. In this scenario we save the script in /usr/local/bin and name it harddrive-watcher.sh:

#!/bin/bash
# Check hard disk temperatures; mail a warning or shut down at the thresholds.
HOSTNAME=yourHostname
MAILTARGET=root
HDDS="/dev/sda /dev/sdb"
HDT=/usr/sbin/hddtemp
LOG=/usr/bin/logger
DOWN=/sbin/shutdown
ALERT_LEVEL_MAIL=35      # °C: log and mail a warning
ALERT_LEVEL_SHUTDOWN=50  # °C: shut the system down
for disk in $HDDS
do
  if [ -b "$disk" ]; then
        # -n makes hddtemp print only the number, so the result does not
        # depend on how many words the drive's model name has
        HDDTEMP=$("$HDT" -n "$disk")
        # Skip this disk if we did not get a plain number back
        case "$HDDTEMP" in
           ''|*[!0-9]*) continue ;;
        esac
        if [ "$HDDTEMP" -ge "$ALERT_LEVEL_MAIL" ]; then
           MSG="Warning: hard disk $disk temperature reached the warning limit: ${HDDTEMP}°C"
           "$LOG" "$MSG"
           echo "$MSG" | mail -s "$HOSTNAME" "$MAILTARGET"
        fi
        if [ "$HDDTEMP" -ge "$ALERT_LEVEL_SHUTDOWN" ]; then
           "$LOG" "Emergency shutdown: system going down as hard disk $disk temperature reached the final limit: ${HDDTEMP}°C"
           sync
           "$DOWN" -h now
        fi
  fi
done
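
Since you cannot easily heat a disk to 50°C for a test, it can help to try the threshold logic in isolation first. The following is only a sketch of the same two-level comparison as in the script; the function name classify_temp and its output words are invented for this example and are not part of hddtemp.

```shell
#!/bin/bash
# Hypothetical helper: classify_temp TEMP WARN_LIMIT SHUTDOWN_LIMIT
# Prints "ok", "warn" or "shutdown"; prints "invalid" and returns 1
# when TEMP is not a plain non-negative integer.
classify_temp() {
  local temp="$1" warn="$2" down="$3"
  case "$temp" in
    ''|*[!0-9]*) echo "invalid"; return 1 ;;
  esac
  if [ "$temp" -ge "$down" ]; then
    echo "shutdown"
  elif [ "$temp" -ge "$warn" ]; then
    echo "warn"
  else
    echo "ok"
  fi
}

# Try a few values against the thresholds used in this article (35 and 50).
for t in 34 35 49 50; do
  echo "$t -> $(classify_temp "$t" 35 50)"
done
```

Another simple way to test the real script is to lower ALERT_LEVEL_MAIL temporarily below the current drive temperature and check that the warning mail arrives.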

Lastly, we make the script executable (chmod +x /usr/local/bin/harddrive-watcher.sh) and add the following line to root's crontab (crontab -e as root):

*/5 * * * * /usr/local/bin/harddrive-watcher.sh
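
If you would rather add the entry from a script than through an editor, something like the following sketch could work. The function name add_cron_line is hypothetical; it passes existing crontab entries through from standard input, drops an exact duplicate of the new line and appends the line once, so running it repeatedly does not pile up copies. It assumes the script lives in /usr/local/bin as described above.

```shell
#!/bin/bash
# Hypothetical helper: read existing crontab lines on stdin and write them
# back out with LINE present exactly once at the end.
add_cron_line() {
  local line="$1"
  # -x: match whole lines only, -F: treat LINE as a fixed string (no regex),
  # -v: print everything that does NOT match. grep exits non-zero when it
  # prints nothing, which is fine here, hence "|| true".
  grep -vxF -- "$line" || true
  echo "$line"
}

CRON_LINE='*/5 * * * * /usr/local/bin/harddrive-watcher.sh'

# Usage as root (left commented out so that nothing is changed by accident):
# crontab -l 2>/dev/null | add_cron_line "$CRON_LINE" | crontab -
```

Afterwards, crontab -l should list the entry exactly once.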

Now the temperature is checked every 5 minutes against two thresholds: when the first one is reached, you'll receive a mail. At the second threshold, the system will shut down to protect its data from thermal damage.
