bash slurm exit-code

SLURM status string on job completion / exit


How do I get the SLURM job status (e.g. COMPLETED, FAILED, TIMEOUT, ...) on job completion, from within the submission script? That is, I want to keep a separate record of jobs that timed out or failed.

Currently I work with the exit code. However, jobs that end in TIMEOUT also get exit code 0.


Solution

  • For future reference, here is how I finally do it.

    At the beginning of the job, retrieve the job ID and write some information (e.g. "${SLURM_JOB_ID} ${PWD}") to a summary file.

    Then process this file and use something like sacct -X -n -o State -j ${jid} to get the job status.
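The two steps above could look roughly like the following sketch. The summary-file location and the function names record_job and report_unsuccessful are hypothetical, chosen only for illustration. It assumes sacct is on the PATH and that job accounting is enabled on the cluster.

```bash
#!/usr/bin/env bash
# Sketch only: SUMMARY_FILE and both function names are assumptions,
# not part of SLURM itself.

SUMMARY_FILE="${SUMMARY_FILE:-$HOME/slurm_jobs.txt}"

# Step 1: call this near the top of the submission script.
# Appends one line of the form "<jobid> <workdir>" to the summary file.
record_job() {
    printf '%s %s\n' "${SLURM_JOB_ID}" "${PWD}" >> "${SUMMARY_FILE}"
}

# Step 2: run this later, outside the job (e.g. on the login node).
# Prints "<jobid> <state> <workdir>" for every recorded job whose
# state is anything other than COMPLETED.
report_unsuccessful() {
    local jid dir state
    while read -r jid dir; do
        [ -n "${jid}" ] || continue
        # -X: allocation only (no job steps), -n: no header,
        # -o State: print only the state column.
        # awk keeps the first word, so "CANCELLED by 1234" becomes "CANCELLED".
        state="$(sacct -X -n -o State -j "${jid}" </dev/null | awk 'NR==1 {print $1}')"
        if [ "${state}" != "COMPLETED" ]; then
            printf '%s %s %s\n' "${jid}" "${state:-UNKNOWN}" "${dir}"
        fi
    done < "${SUMMARY_FILE}"
}
```

In the submission script, source this file and call record_job before the actual work starts. Afterwards, report_unsuccessful lists jobs that ended as TIMEOUT, FAILED, and so on. Jobs that are still PENDING or RUNNING are listed as well, so it is best run once the batch has finished.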