I want to start many independent tasks (job steps) as part of one job and want to keep track of the highest exit code of all these tasks.
Inspired by this question, I am currently doing something like this:
#SBATCH stuff....
for i in {1..3}; do
    srun -n 1 ./myprog ${i} >& task${i}.log &
done
wait
in my jobs.sh, which I sbatch, to start my tasks.
How can I define a variable exitcode which, after the wait command, contains the highest exit code of all the tasks?
Thanks so much in advance!
You can store the PIDs of the background jobs in an array and wait for each one individually, like this:
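#SBATCH stuff....
pids=()
exitcode=0
for i in {1..3}; do
    srun -n 1 ./myprog ${i} >& task${i}.log &
    pids+=($!)   # remember the PID of each background srun
done
for pid in "${pids[@]}"; do
    wait "$pid"  # returns the exit status of that background job
    rc=$?
    # keep the highest exit code seen so far
    exitcode=$(( rc > exitcode ? rc : exitcode ))
done
echo $exitcode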
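If you want to try the pattern without a Slurm allocation, here is a minimal sketch that swaps srun for a stand-in. The function fake_task is hypothetical and not part of the original script: it just returns the number it is given, so the three background jobs finish with exit codes 1, 2 and 3. It assumes bash, since it uses arrays.

```shell
#!/bin/bash
# fake_task is a hypothetical stand-in for `srun -n 1 ./myprog ${i}`;
# it returns its first argument as its exit status.
fake_task() {
    return "$1"
}

pids=()
exitcode=0

# Start each task in the background and record its PID.
for i in 1 2 3; do
    fake_task "$i" > /dev/null 2>&1 &
    pids+=("$!")
done

# Wait for each PID in turn; `wait PID` returns that job's exit status.
for pid in "${pids[@]}"; do
    wait "$pid"
    rc=$?
    if (( rc > exitcode )); then
        exitcode=$rc
    fi
done

echo "highest exit code: $exitcode"
```

This should print "highest exit code: 3". In the real jobs.sh you could probably end with exit "$exitcode" so that the batch script itself finishes with the highest code, but check how your site reports job exit codes before relying on that.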