cluster-computing, lsf

How to rerun EXITing parts of an LSF job array?


I submit an array of many jobs to an LSF cluster. Most run and finish in the DONE state, but some may EXIT. I need a way to re-run only the member jobs of the array that EXITed.

Thanks.


Solution

  • I've been playing around with the same issue, and the command:

    brequeue -e <jobarrayid>
    

    should do what you're after. You don't need to specify which elements should be rerun; the -e switch should pick out only the EXIT'd indexes.
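
    As a rough sketch of how the command above might be wrapped for repeated use, the helper below just checks that a job array ID was given and passes it to brequeue -e. The function name requeue_exited and the job ID 1234 are made up for illustration; only brequeue -e comes from the answer itself.

```shell
#!/bin/sh
# requeue_exited is a hypothetical helper name, not part of LSF.
# It requeues only the EXIT'd elements of the given job array.
requeue_exited() {
    if [ -z "$1" ]; then
        echo "usage: requeue_exited <jobarrayid>" >&2
        return 1
    fi
    # -e limits the requeue to elements that ended in the EXIT state;
    # DONE elements are left alone.
    brequeue -e "$1"
}

# Example calls (1234 is a placeholder job array ID):
#   requeue_exited 1234
#   requeue_exited "1234[5-10]"   # assumed: restrict to a range of indexes
```

    Running bjobs -A on the array ID beforehand should show a per-state summary of the array, which is a quick way to confirm that there are EXIT'd elements to requeue. The quoted "1234[5-10]" form is an assumption about how your LSF version accepts index lists, so check the brequeue man page on your cluster first.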