I have a script that tracks a process and respawns it if it dies. I want the tracking script to also kill the process when told to, for example by sending the tracking script a SIGTERM. In other words, if I kill the tracking script, it should kill the process it's tracking, stop respawning it, and exit.
Cobbling together several posts (which I think reflect best practices, for instance not using a PID file), I get the following:
#!/bin/bash
DESC="Foo Manager"
EXEC="python /myPath/bin/FooManager.pyc"
trap "BREAK=1;pkill -HUP -P $BASHPID;exit 0" SIGHUP SIGINT SIGTERM
until $EXEC
do
    echo "Server $DESC crashed with exit code $?. Restarting..." >&2
    ((BREAK!=0)) && echo "Breaking" && exit 1
    sleep 1
done
Now I run this script in one xterm, and from another xterm I send it something like:
kill -HUP <tracking_script_pid>   # Doesn't work.
kill -TERM <tracking_script_pid>  # Doesn't work.
The tracking script does not exit or react at all. If I run FooManager.pyc directly from the command line, it dies on SIGHUP and SIGTERM. What could I be doing wrong here, and is there perhaps a whole different way to do it?
Thanks.
From the manual:
If Bash is waiting for a command to complete and receives a signal for which a trap has been set, the trap will not be executed until the command completes. When Bash is waiting for an asynchronous command via the wait builtin, the reception of a signal for which a trap has been set will cause the wait builtin to return immediately with an exit status greater than 128, immediately after which the trap is executed.
Emphasis is mine.
So in your case, while your command is executing, Bash will wait until it ends before it triggers the trap.
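The behaviour the manual describes can be observed with a small experiment. This is only a sketch: `sleep 5` stands in for the real program, and the script sends itself a SIGTERM after one second. The variable names are made up for the illustration.

```shell
#!/bin/bash
# Sketch: a trapped signal interrupts the `wait` builtin right away.
got_signal=false
trap 'got_signal=true' TERM

sleep 5 &                    # stand-in for the long-running program
child_pid=$!

# Send ourselves SIGTERM about one second after we start waiting.
( sleep 1; kill -TERM "$$" ) &

start=$(date +%s)
wait "$child_pid"            # returns early when the trapped signal arrives
wait_status=$?
elapsed=$(( $(date +%s) - start ))

echo "wait returned $wait_status after about ${elapsed}s; got_signal=$got_signal"

kill "$child_pid" 2>/dev/null   # clean up the stand-in job
```

`wait` should come back long before the five seconds are up, with a status above 128, and the trap will have run by the time the next line executes. If you replace `sleep 5 &` plus `wait` with a plain foreground `sleep 5`, the trap only runs once the sleep has finished.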
To fix this, you need to run your program as a job, and wait for it. If your program never exits with a return code greater than 128, you could simplify the following code, but I'm not making this assumption:
#!/bin/bash
desc="Foo Manager"
to_exec=( python "/myPath/bin/FooManager.pyc" )

# The trap only sets a flag; the loop below does the actual cleanup.
trap 'trap_triggered=true' SIGHUP SIGINT SIGTERM
trap_triggered=false

while ! $trap_triggered; do
    # Run the program as a background job so that wait can be interrupted.
    "${to_exec[@]}" &
    job_pid=$!
    wait "$job_pid"
    job_ret=$?
    if [[ $job_ret = 0 ]]; then
        echo >&2 "Job ended gracefully with no errors... quitting..."
        break
    elif ! $trap_triggered; then
        echo >&2 "Server $desc crashed with exit code $job_ret. Restarting..."
    else
        # wait was interrupted by a trapped signal; the job may still be running.
        printf >&2 "Received fatal signal... "
        if kill -0 "$job_pid" >&/dev/null; then
            printf >&2 "killing job $job_pid... "
            kill "$job_pid"
            wait "$job_pid"
        fi
        printf >&2 "quitting...\n"
    fi
done
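To try the loop without the real program, something along these lines might help. It is a rough sketch, not part of the script above: `fake_server` is a hypothetical stand-in that "crashes" with status 3 after two seconds, and the script signals itself after three seconds to simulate `kill -TERM <tracking_script_pid>`.

```shell
#!/bin/bash
# Sketch: exercise the run-as-job-and-wait loop with a stand-in job.
# fake_server is hypothetical; it exits with status 3 after two seconds.
fake_server() { sleep 2; exit 3; }

trap 'trap_triggered=true' TERM
trap_triggered=false
restarts=0

# Simulate an external `kill -TERM` arriving after about three seconds.
( sleep 3; kill -TERM "$$" ) &

while ! $trap_triggered; do
    fake_server &                 # runs in a background subshell
    job_pid=$!
    wait "$job_pid"
    job_ret=$?
    if ! $trap_triggered; then
        restarts=$(( restarts + 1 ))
        echo "job exited with $job_ret, restart #$restarts" >&2
    else
        # Interrupted by the signal: stop the job and leave the loop.
        kill "$job_pid" 2>/dev/null
        wait "$job_pid" 2>/dev/null
        echo "signal received, stopping after $restarts restart(s)" >&2
    fi
done
```

With these timings the stand-in crashes once and is restarted, and the signal arrives during the second run, so the loop should stop after a single restart.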
Notes.