
Bash: How to monitor a script for execution failure


I'm using Linux and want to watch a script so that it is respawned whenever it runs into an execution failure. Below is a simple script that demonstrates the problem.

Here's my script:

#!/bin/bash

echo '**************************************'
echo '*          Run IRC Bot               *'
echo '**************************************'
echo '';
if [ -z "$1" ] 
    then
        echo 'Example usage: ' $0 'intelbot'
fi

until `php $1.php`; 
    do 
        echo "IRC bot '$1' crashed with the code $?.  Respawning.." >&2; 
        sleep 5 
done;

Which kill option (signal) should I use to tell the until loop that I want this process killed and then started again?

Edit: The aim here was to manually check for a script-execution failure so the IRC bot can be respawned. The posted answer is very detailed, so +1 to the contributor. A supervisor is indeed the best way to tackle this problem.


Solution

  • First: don't do this at all. Use a proper process supervision system, not a shell script, to restart your program automatically. Your operating system will ship with one, be it SysV init's /etc/inittab (which, yes, will restart listed programs when they exit if they are given the respawn action), or the more modern upstart (shipped with Ubuntu), systemd (shipped with current Fedora and Arch Linux), runit, daemontools, supervisord, launchd (shipped with Mac OS X), etc.
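
As a rough illustration of the supervisor approach, here is a minimal sketch of what a systemd unit for the bot might look like. The unit name `ircbot.service` and the paths `/usr/bin/php` and `/opt/ircbot/intelbot.php` are assumptions made up for this example, and the script writes the unit into a scratch directory rather than a real system location so it is safe to run.

```shell
#!/bin/bash
# Sketch only: the unit name and all paths below are hypothetical.
unit_dir=$(mktemp -d)                  # stand-in for /etc/systemd/system
unit_file="$unit_dir/ircbot.service"

cat > "$unit_file" <<'EOF'
[Unit]
Description=IRC bot (example)
After=network.target

[Service]
# Hypothetical paths; adjust to where php and the bot actually live.
ExecStart=/usr/bin/php /opt/ircbot/intelbot.php
# Restart the bot when it exits with a failure, after a 5 second pause.
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

cat "$unit_file"
```

On a real system the file would typically be placed in /etc/systemd/system/ and started with `systemctl daemon-reload` followed by `systemctl enable --now ircbot.service`. Note that systemd treats an exit caused by SIGTERM as a clean exit, so `Restart=on-failure` would not respawn the bot in that case; `Restart=always` is the setting to consider if it should come back regardless.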


    Second: The backticks actually make your code behave in unpredictable ways; so does the lack of quotes on an expansion.

    `php $1.php`
    

    ...does the following:

    • Substitutes the value of $1 into the string $1.php; let's say the result is my * code.php.
    • String-splits that value; in this case, it would change it into three separate arguments: my, *, and code.php
    • Glob-expands those arguments; in this case, the * would be replaced with a separate argument for each file in the current directory
    • Runs php with the resulting arguments
    • Reads the output that program wrote to stdout, and runs that output as a separate command
    • Returns the exit status of that separate command.
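
The steps above can be reproduced in a small experiment. This is only a sketch: `count_args` is a hypothetical helper that stands in for php, and the files `a.txt` and `b.txt` are created in a scratch directory so the glob result is predictable.

```shell
#!/bin/bash
# Hypothetical helper standing in for php: reports how many arguments it got.
count_args() { echo "$# argument(s)"; }

# Use a scratch directory with exactly two files so "*" expands predictably.
orig_dir=$PWD
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch a.txt b.txt

set -- 'my * code'                  # pretend the script was called with this as $1

# Unquoted: split into my, *, code.php, and then * expands to a.txt b.txt.
unquoted=$(count_args $1.php)
# Quoted: the whole value stays a single argument.
quoted=$(count_args "$1.php")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# Backticks in command position run the command's OUTPUT as a second command.
# Here "echo false" succeeds, but its output, "false", is then run and fails.
backtick_status=0
`echo false` || backtick_status=$?
echo "backtick status: $backtick_status"

cd "$orig_dir" && rm -rf "$scratch"
```

With the two files present, the unquoted call receives four arguments and the quoted call receives one. The backtick line reports the status of `false`, not of `echo`, which is why the original loop never saw PHP's own exit status.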

    Instead:

    until php "$1.php"; do 
        echo "IRC bot '$1' crashed with the code $?.  Respawning.." >&2; 
        sleep 5 
    done;
    

    Now, the exit status PHP returns when it receives a SIGTERM can be controlled by PHP's signal handler. Unless you tell us how your PHP code is written, only signals which can't be handled (such as SIGKILL) will behave in an entirely consistent manner, and because they can't be handled, they're dangerous if your program needs to do any kind of safe shutdown or cleanup.
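
To see what the until loop would observe, here is a minimal sketch that uses `sleep 30` as a stand-in for the bot (an assumption for the demo; `sleep` installs no signal handlers, whereas your PHP code might). When a process dies from an unhandled signal, the shell reports an exit status of 128 plus the signal number.

```shell
#!/bin/bash
# Sketch: "sleep 30" stands in for the bot and has no signal handlers.

sleep 30 &
pid=$!
kill -TERM "$pid"        # polite request; a real program may trap this
wait "$pid"
term_status=$?

sleep 30 &
pid=$!
kill -KILL "$pid"        # cannot be caught, so no cleanup is possible
wait "$pid"
kill_status=$?

echo "after SIGTERM: $term_status"
echo "after SIGKILL: $kill_status"
```

On Linux, where SIGTERM is 15 and SIGKILL is 9, this reports 143 and 137. Both are nonzero, so the until loop would treat either as a crash and respawn the bot, provided the process really has no handler that turns SIGTERM into a clean exit with status 0.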

    If you want your PHP code to install a signal handler, so you can control its exit status when signaled, see http://php.net/manual/en/function.pcntl-signal.php