
How to extend time for program to finish after USR2 signal


I have a program that will run for a very long time on my university's LSF cluster, and I don't know whether it will finish before it exceeds its job's time limit. If a job exceeds the time limit, the LSF system sends increasingly unfriendly termination signals to the program before finally killing it. I programmed the code to catch the USR2 signal and save its data, but saving takes a few minutes. My university's guide to the LSF system states that the option

-ta USR2 -wt [hh:]mm

extends the time limit the program has to react to USR2.

I already tried the following options:

-ta USR2 -wt '00:20'
-ta USR2 -wt 00:20
-ta USR2 -wt 20
-ta USR2 -wt '20'

and all of the above with USR2 replaced by 'USR2'.

I expected the job to be submitted, but instead I get this error:

a: Bad time specification. Job not submitted.

Solution

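  • I think you want

    -wa USR2 -wt 20

    Here -wa sets the job warning action (the signal to send) and -wt sets the warning time, that is, how long before the job is terminated the signal is sent.

    -ta isn't a bsub option, so bsub parses it as -t (the job termination deadline) with a time specification of a. Hence the error message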

    a: Bad time specification. Job not submitted.