
Global log for job array [SLURM]


In addition to creating an individual log file for each array job (i.e., sbatch --output=job_%A_%a.out ...), is there any way to get a final report listing all the jobs that ran and showing which ones crashed? Something like this:

839594_1    COMPLETE
839594_2    FAIL
839594_3    COMPLETE
839594_4    COMPLETE
839594_5    COMPLETE
839594_6    FAIL

Solution

  • You should obtain the output you want with the sacct command:

    sacct -X -j 839594 -o jobid%-30,state
    

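    Here -X limits the report to one line per array task (job steps such as .batch are omitted), and jobid%-30 left-aligns the job ID in a 30-character column. You can also add -n to suppress the column headers.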

    Note that all PENDING jobs will be summarised into a single line like so:

    839594_[10-20]         PENDING
    

    if tasks 10 to 20 are still pending.
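
To list only the tasks that did not finish cleanly, the sacct output can be filtered. The following is a minimal sketch, not part of the original answer: failed_tasks is a hypothetical helper name, and it assumes sacct is called with -P (--parsable2) so that fields are separated by "|". Note that sacct reports states as COMPLETED and FAILED rather than COMPLETE and FAIL.

```shell
# Hypothetical helper: print every array task whose state is not COMPLETED.
# Expects pipe-delimited "jobid|state" lines on stdin, as produced by
# sacct with -n (no header) and -P (--parsable2).
failed_tasks() {
    awk -F'|' '$2 != "" && $2 != "COMPLETED" { print $1, $2 }'
}

# Usage, assuming the job ID 839594 from the question:
# sacct -X -n -P -j 839594 -o jobid,state | failed_tasks
```

Because the filter keeps anything that is not COMPLETED, tasks that are still PENDING or RUNNING are listed too, so it is most useful once the whole array has finished.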