
How to restart a process in bash or kill it on command?


I have a script that tracks a process and respawns it if it dies. I also want the tracking script to kill the process when told to, for example when the tracking script receives a SIGTERM. In other words, if I kill the tracking script, it should kill the process it is tracking, stop respawning it, and exit.

Cobbling together several posts (which I think reflect best practices, for instance not using a PID file), I get the following:

#!/bin/bash

DESC="Foo Manager"
EXEC="python /myPath/bin/FooManager.pyc"

trap "BREAK=1;pkill -HUP -P $BASHPID;exit 0" SIGHUP SIGINT SIGTERM

until $EXEC
do
    echo "Server $DESC crashed with exit code $?.  Restarting..." >&2
    ((BREAK!=0)) && echo "Breaking" && exit 1
    sleep 1
done

Now I run this script in one xterm, and in another xterm I send the script something like:

kill -HUP <tracking_script_pid>  # Doesn't work.
kill -TERM <tracking_script_pid>  # Doesn't work.

The tracking script does not exit or react in any way. If I run FooManager.pyc directly from the command line, it dies on SIGHUP and SIGTERM. What could I be doing wrong here, and is there perhaps a completely different way to do it?

Thanks.


Solution

  • From the manual:

    If Bash is waiting for a command to complete and receives a signal for which a trap has been set, the trap will not be executed until the command completes. When Bash is waiting for an asynchronous command via the wait builtin, the reception of a signal for which a trap has been set will cause the wait builtin to return immediately with an exit status greater than 128, immediately after which the trap is executed.

    Emphasis is mine.

    So in your case, while your command is executing, Bash will wait until it ends before it triggers the trap.
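
    To see the difference on a small scale, here is a minimal sketch (not part of the original script) in which the shell sends SIGTERM to itself, first while a foreground sleep is running and then while it sits in wait. The variable names and sleep durations are made up for the demonstration:

    ```bash
    #!/bin/bash
    # Demo only: compares when a TERM trap runs for a foreground command vs. `wait`.
    got_term=false
    trap 'got_term=true' TERM

    # Case 1: foreground command. A helper subshell signals this shell after 1s.
    ( sleep 1; kill -TERM "$$" ) &
    start=$(date +%s)
    sleep 2                                 # the trap is held back until this returns
    fg_elapsed=$(( $(date +%s) - start ))   # the full sleep, not cut short by the signal
    fg_trapped=$got_term                    # the trap did run, but only afterwards

    # Case 2: the same signal while the shell is in `wait` on a background job.
    got_term=false
    sleep 5 &
    child=$!
    ( sleep 1; kill -TERM "$$" ) &
    wait "$child"
    wait_status=$?                          # greater than 128: wait was interrupted
    if kill -0 "$child" 2>/dev/null; then child_alive=true; else child_alive=false; fi

    # Clean up the leftover background sleep and restore the default handler.
    kill "$child" 2>/dev/null
    wait "$child" 2>/dev/null
    trap - TERM
    echo "foreground: trapped=$fg_trapped after ${fg_elapsed}s"
    echo "wait: status=$wait_status trapped=$got_term child_alive=$child_alive"
    ```

    In the first case the foreground sleep should run its full two seconds before the trap fires. In the second case wait should return after about one second with a status above 128, while the background sleep is still alive.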

    To fix this, you need to run your program as a job, and wait for it. If your program never exits with a return code greater than 128, you could simplify the following code, but I'm not making this assumption:

    #!/bin/bash
    
    desc="Foo Manager"
    to_exec=( python "/myPath/bin/FooManager.pyc" )
    
    # The handler only sets a flag; the loop below does the actual cleanup.
    trap 'trap_triggered=true' SIGHUP SIGINT SIGTERM
    
    trap_triggered=false
    while ! $trap_triggered; do
       # Run the program as a background job so that a trapped signal can interrupt wait.
       "${to_exec[@]}" &
       job_pid=$!
       wait "$job_pid"
       job_ret=$?
       if [[ $job_ret = 0 ]]; then
          echo >&2 "Job ended gracefully with no errors... quitting..."
          break
       elif ! $trap_triggered; then
          echo >&2 "Server $desc crashed with exit code $job_ret. Restarting..."
       else
          printf >&2 "Received fatal signal... "
          # The job may still be running: wait returned because of the signal, not because the job ended.
          if kill -0 "$job_pid" >&/dev/null; then
              printf >&2 "killing job $job_pid... "
              kill "$job_pid"
              wait "$job_pid"
          fi
          printf >&2 "quitting...\n"
       fi
    done
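
    For comparison, here is one possible reading of the simplification mentioned above. It is a sketch that does make the assumption that the program itself never exits with a status above 128, so any such status from wait is taken to mean "a trapped signal arrived". The supervise function, its no-op trap and the one-second pause between restarts are choices made for this sketch, not part of the script above:

    ```bash
    #!/bin/bash
    # supervise CMD [ARGS...]: hypothetical helper that restarts CMD until it
    # exits with 0 or until this shell receives HUP, INT or TERM.
    supervise() {
        local job_pid job_ret
        trap ':' HUP INT TERM        # no-op handler: its only job is to interrupt wait
        while :; do
            "$@" &
            job_pid=$!
            wait "$job_pid"
            job_ret=$?
            if (( job_ret == 0 )); then
                echo >&2 "Job ended gracefully with no errors... quitting..."
                break
            elif (( job_ret > 128 )); then
                # Assumption: only an interrupted wait yields a status above 128.
                # A job that was itself killed by a signal also lands here.
                echo >&2 "Received signal... stopping job $job_pid and quitting..."
                kill "$job_pid" 2>/dev/null
                wait "$job_pid" 2>/dev/null
                break
            fi
            echo >&2 "Job crashed with exit code $job_ret. Restarting..."
            sleep 1
        done
        trap - HUP INT TERM          # restore the default handlers
        return "$job_ret"
    }
    ```

    It would be called as, for instance, supervise "${to_exec[@]}". The trade-off is the one the answer hints at: a job that dies from a signal of its own, such as a segmentation fault, also produces a status above 128 and would be treated as a request to stop rather than as a crash.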
    

    Notes.

    1. I used lowercase variable names, since uppercase names are considered bad practice: they can clash with Bash's reserved names or with environment variables.
    2. I didn't store the command in a string but in an array. With a string, you'll run into problems as soon as an argument contains funny characters such as spaces. With a properly quoted array, you won't have any problems. (Some would argue that it would be even better to use a function.)
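
    The second note can be illustrated with a small sketch. Here count_args is a hypothetical stand-in for the real program that simply prints how many arguments it received. The same command line is stored as a string, as an array and as a function:

    ```bash
    #!/bin/bash
    # count_args is a made-up stand-in for the real program:
    # it prints the number of arguments it was called with.
    count_args() { echo "$#"; }

    # String: the inner quotes are literal characters, and the unquoted
    # expansion is split on every space.
    cmd_string='count_args "two words" x'
    string_count=$($cmd_string)

    # Array: each element is passed as exactly one argument.
    cmd_array=( count_args "two words" x )
    array_count=$("${cmd_array[@]}")

    # Function: the command line is ordinary shell code, quoted as usual.
    run_cmd() { count_args "two words" x; }
    function_count=$(run_cmd)

    echo "string: $string_count, array: $array_count, function: $function_count"
    # prints "string: 3, array: 2, function: 2"
    ```

    With the string, the argument "two words" is split into two pieces that still carry the literal quote characters, so the program sees three arguments instead of two.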