Tags: bash, slurm

Suspend a script in slurm until another child job is finished


I have a pipeline with a control script that makes multiple sbatch calls to other scripts. I want to hold the main script until each sbatch call has finished, but I don't want to keep it running for many hours. Is there a way to pause it in the Slurm queue until the child jobs end?

I tried something like this:

#!/bin/bash
#SBATCH --job-name PIPE
#SBATCH --partition=fast
#SBATCH --time=00:10:00

WDIR=[a path]
sfile=[a file]

printf "\nSTEP 1: -------------------------------------------------------\n"
cd $WDIR
sbatch --partition=fast,medium --time=00:08:00 --mem=20GB --cpus-per-task=20 --wait \
  launch_prepfasta.sh --sampfile $sfile --odir $WDIR/00_raw
wait

printf "\nSTEP 2: -------------------------------------------------------\n"
if [ ! -d $WDIR/S2 ];then mkdir -p $WDIR/S2; fi
cd $WDIR/S2
scontrol suspend $SLURM_JOB_ID
sbatch --partition=fast,medium --time=02:00:00 --mem=24GB --cpus-per-task=4 --wait \
  launch_mmseqs.sh --sdir $WDIR/00_raw --odir $WDIR/S2
wait
scontrol resume $SLURM_JOB_ID

But it does not work for me, and I am not sure whether it is coded correctly.

Since the first job is not time-consuming, I don't mind making the main script wait. But STEP 2 can be slow, so I need a way to pause the main script without consuming time in the Slurm queue and resume it when launch_mmseqs.sh has finished, if that is possible.


Solution

  • Please note that scontrol suspend will not free the resources; it only suspends the processes linked to that job on the compute nodes allocated to it. Furthermore, as soon as the job suspends itself, it stops running commands, so the subsequent sbatch command will not run, nor will the final scontrol resume $SLURM_JOB_ID.

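    For illustration, suspend/resume is normally issued from outside the job (for instance from a login node), and it typically requires operator or administrator privileges; the job id 123456 below is only a placeholder:

    # Placeholder job id, shown only to illustrate the behaviour described above.
    # scontrol suspend sends SIGSTOP to the job's processes: they stop running,
    # but the allocated nodes, CPUs and memory stay reserved for the job.
    scontrol suspend 123456

    # scontrol resume sends SIGCONT and the processes continue where they stopped.
    scontrol resume 123456
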
    One option is to use `--dependency` to link the two submitted jobs together and let the submitting job complete.

    #!/bin/bash
    #SBATCH --job-name PIPE
    #SBATCH --partition=fast
    #SBATCH --time=00:10:00
    
    WDIR=[a path]
    sfile=[a file]
    
    printf "\nSTEP 1: -------------------------------------------------------\n"
    cd $WDIR
    JOBID1=$(sbatch --parsable --partition=fast,medium --time=00:08:00 --mem=20GB --cpus-per-task=20 launch_prepfasta.sh --sampfile $sfile --odir $WDIR/00_raw)
    
    
    printf "\nSTEP 2: -------------------------------------------------------\n"
    if [ ! -d $WDIR/S2 ];then mkdir -p $WDIR/S2; fi
    cd $WDIR/S2
    
    sbatch --dependency=afterok:${JOBID1} --partition=fast,medium --time=02:00:00 --mem=24GB --cpus-per-task=4 \
      launch_mmseqs.sh --sdir $WDIR/00_raw --odir $WDIR/S2
    
    

    The above submission script, when running, will cd to $WDIR and submit the first job from there, saving its job id into $JOBID1, then create $WDIR/S2, cd there, and submit the second job with a dependency on the first one.

    It will then terminate immediately and free up the resources, leaving two jobs in the queue, the second one configured to start only when the first one has completed successfully.
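
    If the pipeline grows, the same pattern chains naturally: capture each job id with --parsable and make every subsequent step depend on the previous one. The third step and launch_step3.sh below are hypothetical, only to sketch the idea:

    # Capture the second job id as well (same submission as above, with --parsable added):
    JOBID2=$(sbatch --parsable --dependency=afterok:${JOBID1} --partition=fast,medium --time=02:00:00 --mem=24GB --cpus-per-task=4 launch_mmseqs.sh --sdir $WDIR/00_raw --odir $WDIR/S2)

    # Hypothetical third step that starts only after the second one succeeds:
    sbatch --dependency=afterok:${JOBID2} --partition=fast,medium --time=01:00:00 \
      launch_step3.sh --idir $WDIR/S2 --odir $WDIR/S3

    # The remaining dependencies can be inspected in the queue, e.g.:
    squeue -u $USER -o "%.12i %.15j %.10T %.25E"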