I have two Bash scripts. The first is the entrypoint of my Docker container and looks like this:
#!/bin/bash

# Forward the termination signal to the current worker, then exit.
sig_handler()
{
    echo "[LAYER1] killing children with pid $pid"
    [[ $pid ]] && kill "$pid"
    exit 1
}

trap 'sig_handler' SIGINT SIGTERM

# Restart the worker whenever it exits.
while true; do
    ./nextlayer.sh & pid=$!
    wait "$pid"
    echo "Waiting 5 seconds before starting a new worker..."
    sleep 5
done
The second script, nextlayer.sh, also traps signals, but additionally tries to do some cleanup before exiting:
#!/bin/bash

sig_handler()
{
    echo "[LAYER2] Exiting main script and cleaning up tasks"
    cleanup
}

# Simulate slow cleanup work, then exit with 143 (128 + 15, i.e. SIGTERM).
cleanup()
{
    echo "[LAYER2] Cleaning up"
    sleep 5
    echo "Cleanup done, exiting with SIGTERM"
    exit 143
}

trap 'sig_handler' SIGINT SIGTERM

# Count to 10, printing one number per second.
i=0
while [ "$i" -lt 10 ]
do
    i=$(( i + 1 ))
    sleep 1
    echo "$i"
done
The second script simply counts to 10, echoing i once per second. When I run the entrypoint script locally, without Docker, and press Ctrl+C, the script terminates as expected and prints the output of the cleanup function.
However, when I run the same scripts in a Docker container, the SIGTERM only reaches the signal handler in the entrypoint script and never the one in the second-layer script. Does anyone know what I'm doing wrong?
I found the solution.
Adding a wait to the sig_handler function of the entrypoint script solves it:
sig_handler()
{
    echo "[LAYER1] killing children with pid $pid"
    if [[ $pid ]]; then
        kill "$pid"
        wait "$pid" # this is crucial
    fi
    exit 1
}
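The wait $pid keeps the entrypoint from exiting until the child script has finished its own exit handling. Without it, the entrypoint (normally PID 1 in the container) exits right after sending the signal, the container stops, and the child is killed before its cleanup can finish. I tested this with five layers of scripts, and it all works.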
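To see the kill-then-wait pattern in isolation, here is a minimal sketch that runs without Docker. The child function is a hypothetical stand-in for nextlayer.sh, and the temporary log file is only there so the order of events can be inspected afterwards. Neither is part of the original scripts.

```shell
#!/bin/bash
# Sketch only: "child" is a hypothetical stand-in for nextlayer.sh,
# and the temp log file exists purely to record what happened.

log=$(mktemp)

child() {
    # Trap SIGTERM, simulate slow cleanup, then exit with 128 + 15.
    trap 'echo "cleanup start" >> "$log"; sleep 1; echo "cleanup done" >> "$log"; exit 143' TERM
    echo "ready" >> "$log"
    while true; do sleep 1; done
}

# Start the child in the background and remember its PID.
child & pid=$!

# Wait until the child has installed its trap before signalling it.
until grep -q "ready" "$log"; do sleep 1; done

kill "$pid"    # forward SIGTERM to the child
wait "$pid"    # block until the child's cleanup has finished
status=$?

echo "child exited with status $status"
cat "$log"
```

With the wait in place, the log should contain "cleanup done" and the reported status should be 143. If the wait line is removed, the parent carries on immediately, which in a container means the entrypoint can exit before the child gets that far.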