Dovecot/Amavis restart script (work in progress)


We have a dying mail server whose accounts are being migrated to a new server before it is decommissioned. With 800+ email accounts across 25+ domains, it is important for this machine to stay up until the migration is finished.

Lately it has started to fill up with error logs, which freeze MySQL for lack of disk space, stop mail flow, and generally give me a headache. Until the root cause of the errors can be found and fixed, I have come up with a script that checks whether Dovecot and amavisd-new are running and restarts them if they are not.

After reading https://stackoverflow.com/a/7096003/4820993 and a few other common examples, I came up with this:

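# Dovecot counts as up if something is listening on the IMAPS port (993).
# grep -q sets the exit status without printing, so no redirect is needed.
if netstat -an | grep -q ':993.*LISTEN'
then
    echo 'Dovecot is up'
else
    echo 'Dovecot is down, restarting...'
    logger -p mail.info 'dovecot_keepalive: Dovecot is down, restarting...'
    /etc/init.d/dovecot restart
fi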

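# AmavisD counts as up if its init script reports "running".
if /etc/init.d/amavis status | grep -q 'running'
then
    echo 'AmavisD is up'
else
    echo 'AmavisD is down, restarting...'
    logger -p mail.info 'amavis_keepalive: AmavisD is down, restarting...'
    /etc/init.d/amavis restart
    sleep 2
    if /etc/init.d/amavis status | grep -q 'running'
    then
        echo 'AmavisD restarted successfully'
        logger -p mail.info 'amavis_keepalive: AmavisD restarted successfully'
    else
        echo 'AmavisD had a problem restarting, trying to fix it now...'
        logger -p mail.info 'amavis_keepalive: AmavisD had a problem restarting...'
        # Collect the PIDs of all leftover amavisd processes; the [m] keeps
        # the awk process itself from matching.
        pids=$(ps aux | awk '/a[m]avisd/ { print $2 }')
        # Left unquoted on purpose so several PIDs become separate arguments.
        [ -n "$pids" ] && kill $pids
        # Remove the stale pid file so the init script starts a fresh instance.
        rm -f /var/run/amavis/amavisd.pid
        /etc/init.d/amavis start
    fi
fi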
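
One caveat with the port check: the pattern ':993.*LISTEN' also matches a listener on a port such as 9930. A stricter test could compare the local-address column exactly. The following is only a sketch, assuming the column layout of Linux net-tools `netstat -an` (local address in field 4, state in the last field); `port_is_listening` is a hypothetical helper name, not part of any package.

```shell
#!/bin/sh
# port_is_listening PORT
# Hypothetical helper: reads `netstat -an` style text on stdin and succeeds
# only if some socket in state LISTEN has a local address ending in ":PORT".
# Assumes the Linux net-tools column layout (field 4 = local address).
port_is_listening() {
    awk -v port="$1" '
        $NF == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}
```

It would slot into the script as `if netstat -an | port_is_listening 993`, leaving the rest of the Dovecot branch unchanged.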

Who knows, I'm probably making it harder than it is, and if so PLEASE LET ME KNOW!!!

I checked it against http://www.shellcheck.net and corrected it according to its reports. I am piecing this together from examples elsewhere and would love for someone to proofread it before I deploy it.

The first part, which checks Dovecot, is already working fine as a cron job every 6 hours (yes, the server is messed up enough that we need to check it). It's the Amavis section I'm not sure about.
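
Both halves of the script follow the same pattern: check, restart, wait, check again. That pattern could be factored into one function, which would also give Dovecot the same post-restart verification that Amavis gets. A minimal sketch; `ensure_running` and `RECHECK_DELAY` are hypothetical names introduced here for illustration, and the commands passed in would be the ones already used in the script.

```shell
#!/bin/sh
# ensure_running NAME CHECK_CMD RESTART_CMD
# Hypothetical helper: runs CHECK_CMD; if it fails, runs RESTART_CMD, waits
# RECHECK_DELAY seconds (default 2), and checks again.
# Returns 0 if the service is up at the end, 1 otherwise.
ensure_running() {
    name=$1
    check=$2
    restart=$3

    if sh -c "$check" >/dev/null 2>&1; then
        echo "$name is up"
        return 0
    fi

    echo "$name is down, restarting..."
    sh -c "$restart" >/dev/null 2>&1
    sleep "${RECHECK_DELAY:-2}"

    if sh -c "$check" >/dev/null 2>&1; then
        echo "$name restarted successfully"
        return 0
    fi

    echo "$name had a problem restarting"
    return 1
}
```

A call might look like `ensure_running Dovecot "netstat -an | grep -q ':993.*LISTEN'" "/etc/init.d/dovecot restart"`. A non-zero return is the place where the kill-and-remove-pid-file fallback would go, and each message could be mirrored to syslog with logger.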


Solution

  • You can use Monit, which monitors your services and restarts them itself when a check fails.

    Amavisd:

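    # File: /etc/monit.d/amavisd
    # Paths differ between distributions; the script in the question uses
    # /etc/init.d/amavis and /var/run/amavis/amavisd.pid, so adjust to match.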
    check process amavisd with pidfile /var/amavis/amavisd.pid
       group services
       start program = "/etc/init.d/amavisd start"
       stop  program = "/etc/init.d/amavisd stop"
       if failed port 10024 then restart
       if 5 restarts within 5 cycles then timeout
    

    Dovecot:

    # File: /etc/monit.d/dovecot
    check process dovecot with pidfile /var/run/dovecot/master.pid
       start program = "/etc/init.d/dovecot start"
       stop program = "/etc/init.d/dovecot stop"
       group mail
       if failed host localhost port 993 type tcpssl sslauto protocol imap then restart
       if failed host localhost port 143 protocol imap  then restart
       if 5 restarts within 5 cycles then timeout
       depends dovecot_init
       depends dovecot_bin
    check file dovecot_init with path /etc/init.d/dovecot
       group mail
    check file dovecot_bin with path /usr/sbin/dovecot
       group mail
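
Restarting services only treats the symptom. The question says the outages begin when error logs fill the disk, so it may also be worth watching disk usage from the same cron job. A minimal shell sketch, assuming POSIX `df -P` output (usage percentage in column 5, device name without spaces); `disk_used_pct` and `warn_if_full` are hypothetical helpers, and any path or threshold passed to them is a placeholder rather than a value from the question.

```shell
#!/bin/sh
# disk_used_pct PATH
# Hypothetical helper: prints the used-space percentage (digits only) of the
# filesystem that holds PATH, parsed from POSIX `df -P` output.
# Assumes the device name contains no spaces, so usage is in column 5.
disk_used_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# warn_if_full PATH THRESHOLD
# Prints one warning line when usage is at or above THRESHOLD percent and
# prints nothing otherwise.
warn_if_full() {
    used=$(disk_used_pct "$1")
    if [ -n "$used" ] && [ "$used" -ge "$2" ]; then
        echo "disk_keepalive: $1 is ${used}% full"
    fi
}
```

For example, `msg=$(warn_if_full /var/log 90); [ -n "$msg" ] && logger -p mail.warning "$msg"` would write a syslog warning once the partition holding /var/log reaches 90%.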