Tags: bash, dependencies, jobs, slurm, sbatch

sbatch depending on the completion of a job array gives an error: Batch job submission failed: Job dependency problem


I am trying to write a bash script that does some stuff, then starts a job array using sbatch, and when all jobs in the array have finished successfully, starts another job using sbatch. To have everything in one file, I use HereDocs for the SLURM scripts.

Submitting the job array works fine, and the job ID I get when submitting the job array with the flag --parsable is correct. When trying to submit the last job, which depends on the successful completion of all jobs in the array, I get the error: sbatch: error: Batch job submission failed: Job dependency problem

An uncluttered version of my bash script:

#!/bin/bash

# do some stuff to get the directory into which the python script
# running in the array jobs should write its results.
# every job in the array will make its own subdirectory
resdir="the/result/dir/"

# run a python script in a job array
# this part is working fine
jid= sbatch --parsable << EOF
#!/bin/bash
# ...
# configure the SBATCH stuff
# ...
#SBATCH --array=0-9
#
# do the conda stuff
#
# run the test
python main.py --chunk \$SLURM_ARRAY_TASK_ID --resdir $resdir
EOF

echo $jid #this echoes the correct job ID

# process results after the job array is done
# this part gives the error
# sbatch: error: Batch job submission failed: Job dependency problem
sbatch --dependency=afterok:$jid << EOF
#!/bin/bash
#
# configure the SBATCH stuff
#
# do the conda stuff
#
python process_results.py --dir $resdir
EOF

Solution

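  • The problem was a missing $( and ). Written as jid= sbatch..., the shell runs sbatch with jid set to an empty string in that command's environment and does not capture its output. The job ID that appeared on the terminal was printed by sbatch itself, not by echo $jid, so $jid was empty and the dependency became afterok: with no job ID. Replacing the jid= sbatch... part with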

    jid=$(sbatch --parsable << EOF
    #!/bin/bash
    # ...
    # configure the SBATCH stuff
    # ...
    #SBATCH --array=0-9
    #
    # do the conda stuff
    #
    # run the test
    python main.py --chunk \$SLURM_ARRAY_TASK_ID --resdir $resdir
    EOF
    )
    

    does the trick.
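The difference between the two forms can be reproduced without a cluster. The following is a minimal sketch, not the original script: fake_sbatch and the job ID 12345 are hypothetical stand-ins for sbatch --parsable, and the check on the captured value is an added safeguard that the original answer does not include. It also strips a possible ";cluster" suffix, which --parsable may append when a cluster name is present.

```shell
#!/bin/bash
# Hypothetical stand-in for `sbatch --parsable`, so this runs without Slurm.
# It reads the job script from stdin and prints a made-up job ID.
fake_sbatch() {
    cat > /dev/null
    echo "12345"
}

# Broken form: "jid_broken=" followed by a space only sets an empty value
# for the duration of the command. The command's output goes straight to
# stdout and is NOT stored in the variable.
unset jid_broken
jid_broken= fake_sbatch << EOF
#!/bin/bash
echo "array task"
EOF

# Working form: command substitution captures the printed job ID.
jid=$(fake_sbatch << EOF
#!/bin/bash
echo "array task"
EOF
)

# --parsable may print "jobid;cluster"; keep only the part before ";".
jid=${jid%%;*}

# Added safeguard (assumption, not in the original): refuse to build a
# dependency from an empty or non-numeric job ID.
case "$jid" in
    ''|*[!0-9]*) echo "no valid job ID captured" >&2; exit 1 ;;
esac

dependency="--dependency=afterok:$jid"
echo "broken form captured: '${jid_broken:-}'"
echo "working form captured: '$jid'"
echo "would submit with: sbatch $dependency"
```

In the real script, fake_sbatch would be replaced by sbatch --parsable, and the last step would be the actual sbatch $dependency call with its own here-document. Checking the captured ID before the second submission turns a confusing "Job dependency problem" into an immediate, local error message.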