Some of my scripts kick off a number of children in the background and then wait for their completion. Currently I invoke `wait` in a loop, once for each PID:
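    for pid in "${!PIDs[@]}"
    do
        if wait "$pid"
        then
            log ${PIDs[$pid]} completed successfully
        else
            log WARNING: ${PIDs[$pid]} failed
            # Arithmetic increment; a plain errors+=1 appends the string "1"
            # unless errors was declared with -i.
            (( errors += 1 ))
        fi
    done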
This works and lets me analyze and process failures, but the processing happens in the order in which the PIDs are listed, not in the order in which the processes actually complete. That is, the 5th process may finish first, but its exit code will not be processed until the first four are done...
As far as I know, `sh` provides two modes for `wait`:

- `wait` will wait for all backgrounded jobs to finish, but it will always "succeed", losing the exit codes of the backgrounded processes.
- `wait PID` will wait for the specified process. This provides the exit code, but can only wait for that one process.

But maybe bash has improved on this compared to the old sh? Is there a way to ask bash's `wait` to return when any of the backgrounded processes completes, and have it provide both the finished PID and its exit code?
The underlying C functions, `waitpid` and friends, can do this if you pass a PID of -1. I tried doing that with bash and got an error...
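The two `wait` modes described above can be checked with a small sketch. The exit values 3 and 5 and the variable names are arbitrary choices for illustration, not taken from the scripts in question:

```shell
#!/usr/bin/env bash
# Mode "wait PID": the child's own exit code comes back.
(exit 3) &
pid=$!
wait "$pid"
rc_pid=$?

# Mode bare "wait": returns 0 even though the child exited with 5.
(exit 5) &
wait
rc_all=$?

echo "wait PID -> $rc_pid, bare wait -> $rc_all"
```

This should print `wait PID -> 3, bare wait -> 0`, which is exactly the trade-off: the per-PID form keeps the status, the bare form discards it.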
It might take a few steps.
Just as a test, here is a loop that polls each PID, complete with one process killed to prove it catches error codes.
    $: cat tst
    #! /usr/bin/env bash
    # Start five background sleeps of different lengths.
    for x in 1 3 5 7 9; do sleep $x & done
    declare -A rc=()        # PID -> exit code
    pids=($(jobs -pr))      # PIDs of the running background jobs
    while (( ${#pids[@]} ))
    do  for k in "${!pids[@]}"
        do  p=${pids[$k]}
            if ps -p "$p" >/dev/null; then :    # still running, check again later
            else wait "$p"; rc[$p]=$?; unset "pids[$k]"    # collect status, drop PID
                 date +"%F %T PID $p: rc ${rc[$p]}"
            fi
        done
        ((skip++)) || kill "${pids[3]}"    # first pass only: kill the fourth child
        sleep 1
    done
    $: ./tst
    2025-02-07 14:55:19 PID 3036: rc 0
    2025-02-07 14:55:19 PID 3039: rc 143
    2025-02-07 14:55:21 PID 3037: rc 0
    2025-02-07 14:55:23 PID 3038: rc 0
    2025-02-07 14:55:28 PID 3040: rc 0
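As a possible alternative to polling with `ps`, newer bash versions have `wait -n`, which returns as soon as any child finishes, and (from bash 5.1) `wait -n -p VAR`, which also stores the finished PID in `VAR`. The following is only a sketch under that version assumption; the labels, sleep times and exit codes are made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch only: `wait -p` needs bash 5.1+ (`wait -n` alone needs 4.3+).
if (( BASH_VERSINFO[0] * 100 + BASH_VERSINFO[1] < 501 )); then
    echo "this sketch needs bash 5.1 or newer" >&2
    exit 1
fi

declare -A name=()   # PID -> label (hypothetical labels)
declare -A rc=()     # PID -> exit status
order=()             # labels in completion order
errors=0

# Children finish in the reverse of their start order; one fails.
( sleep 3; exit 0 ) & name[$!]=slow
( sleep 2; exit 7 ) & name[$!]=failing
( sleep 1; exit 0 ) & name[$!]=fast

while true; do
    # Block until any child exits; its PID lands in $finished.
    wait -n -p finished
    status=$?
    # With no children left, bash leaves $finished unset.
    [[ -n ${finished-} ]] || break
    rc[$finished]=$status
    order+=("${name[$finished]}")
    (( status == 0 )) || (( errors += 1 ))
    echo "${name[$finished]} (PID $finished) exited with $status"
done
echo "completion order: ${order[*]}; errors: $errors"
```

Each child is reported as it exits rather than in start order, so the last line should read `completion order: fast failing slow; errors: 1`. On bash older than 5.1 the polling loop above remains the portable option.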