cluster-computing, slurm, hpc

Slurm ignores dependency on a running job


Suppose that job 12345 is currently running on a Slurm cluster. I want to submit another job that starts only after this job finishes. I tried sbatch -d after:12345 job.script, but scontrol show job 12346 displays Dependency=(null). I then tried scontrol update JobId=12346 dependency=after:12345, but scontrol still shows Dependency=(null). Why is this dependency ignored? Can I change anything to make this work as desired? I don't see this problem if the dependency is a job that is not running.


Solution

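  • With -d after:12345, you are setting a dependency on the start of job 12345, not on its end. As that job is already running, the condition is already satisfied and there is nothing left to wait for, which is presumably why scontrol reports Dependency=(null).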

    What you want is either

    • -d afterok:12345 to set a dependency on the successful completion (exit code 0) of job 12345; or
    • -d afterany:12345 to set a dependency on the end of job 12345, whether it succeeded, failed, or was canceled.
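
As a rough sketch of the two options above, the following script builds the dependency string and submits the job. The job ID 12345 and the file name job.script come from the question; the variable names are only illustrative, and the sbatch guard is an assumption added so that the script does nothing harmful on a machine without Slurm.

```shell
#!/usr/bin/env bash
# Job ID and script name are taken from the question; adjust for your cluster.
running_job=12345
dep_type=afterany   # use afterok to require a zero exit code instead

# sbatch expects the dependency in the form "type:jobid".
dep_spec="${dep_type}:${running_job}"

# Submit only where Slurm is installed; otherwise just print the command.
if command -v sbatch >/dev/null 2>&1; then
    # --parsable prints the new job ID (plus ";cluster" on multi-cluster setups).
    new_job=$(sbatch --parsable --dependency="$dep_spec" job.script)
    # Show the Dependency field of the newly submitted job.
    scontrol show job "${new_job%%;*}" | grep -o 'Dependency=[^ ]*'
else
    echo "would run: sbatch --dependency=$dep_spec job.script"
fi
```

If the dependency was accepted, the Dependency field printed at the end should name job 12345 rather than (null), for as long as that job is still running.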