arrays bash sungridengine

How to run an array job within a pipeline of several held jobs when the number of sub-jobs in the array depends on the result of a previous job


I am trying to write a bash script that submits several jobs to the cluster (SGE scheduler), each of which waits for the previous one to finish, such as:

HOLD_ID=$(qsub JOB1.sh | cut -c 10-16)
HOLD_ID=$(qsub -hold_jid $HOLD_ID JOB2.sh | cut -c 10-16)
HOLD_ID=$(qsub -hold_jid $HOLD_ID JOB3.sh | cut -c 10-16)
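As a side note, `cut -c 10-16` only works while the message reads "Your job NNNNNNN ..." with a seven-digit ID. A minimal sketch of a sturdier way to capture the ID, assuming your SGE version supports `qsub -terse` (which prints only the job ID); `submit_job` is a hypothetical helper name, not part of SGE:

```shell
#!/usr/bin/env bash
# Hypothetical helper: submit a job and print only its numeric job ID.
# Assumes `qsub -terse` is available, which prints the job ID alone.
submit_job() {
    local out
    out=$(qsub -terse "$@") || return 1
    # Array jobs are reported as "<id>.<task range>"; keep only the part before the dot.
    printf '%s\n' "${out%%.*}"
}

# Example usage (commented out so the snippet can be sourced without a cluster):
# HOLD_ID=$(submit_job JOB1.sh)
# HOLD_ID=$(submit_job -hold_jid "$HOLD_ID" JOB2.sh)
# HOLD_ID=$(submit_job -hold_jid "$HOLD_ID" JOB3.sh)
```

The helper passes all of its arguments straight to `qsub`, so options such as `-hold_jid` or `-t` go before the script name as usual.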

This works perfectly. However, I now want to add a held array job to this pipeline, such as:

qsub -hold_jid $HOLD_ID -t 1-$NB_OF_SUBJOBS JOB4.sh

But here the number of sub-jobs ($NB_OF_SUBJOBS) depends on the result of JOB2.sh.

I want this to be a fast master script that just submits all the jobs. I would rather not use a while + sleep loop or something similar, which was my first attempt. The job that determines the number I need (JOB2.sh) takes a relatively long time. Since the last line is evaluated at submission time, any variable or file holding the number of sub-jobs produced by JOB2.sh will not work. Any ideas?

Many thanks,

David


Solution

  • So, if I understand correctly, the submission of job 4 depends on information that is only available once job 2 has completed. If so, you will need to submit job 4 after job 2 completes, which is different from submitting job 4 up front and holding its execution until job 2 completes.

    Why not use the -sync y option on job 2, so that qsub blocks until job 2 completes and job 4 is only submitted afterwards:

    qsub -sync y -hold_jid $HOLD_ID JOB2.sh
    

    Make sure job 2 writes the number of sub-jobs somewhere, such as a file (n_subjobs.txt in the example below), or parse it from the job's output into a variable, as you did for the job ID. Then read this value when submitting job 4:

    qsub -t 1-$(cat n_subjobs.txt) JOB4.sh
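
Putting the pieces together, the whole master script might look roughly like the sketch below. It assumes that JOB2.sh writes a single positive integer to n_subjobs.txt, that `qsub -terse` is available to print bare job IDs, and that JOB4 should still wait for JOB3 as in the original pipeline; `run_pipeline` is a hypothetical name used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch of the master script; run_pipeline is a hypothetical wrapper name.
# Assumption: JOB2.sh writes the number of sub-jobs to the file given as $1
# (default: n_subjobs.txt).
run_pipeline() {
    local n_file=${1:-n_subjobs.txt}
    local job1_id job3_id n_subjobs

    job1_id=$(qsub -terse JOB1.sh) || return 1

    # -sync y makes qsub block until JOB2 has finished.
    # Options must come before the script name; anything after it is
    # passed to the script as arguments.
    qsub -sync y -hold_jid "$job1_id" JOB2.sh || return 1

    # JOB2 is done, so the count it produced can be read now.
    n_subjobs=$(tr -d '[:space:]' < "$n_file") || return 1
    if ! [[ $n_subjobs =~ ^[1-9][0-9]*$ ]]; then
        echo "Invalid sub-job count in $n_file: '$n_subjobs'" >&2
        return 1
    fi

    # JOB3 needs no hold on JOB2 any more; JOB4 still waits for JOB3.
    job3_id=$(qsub -terse JOB3.sh) || return 1
    qsub -hold_jid "$job3_id" -t "1-$n_subjobs" JOB4.sh
}

# Example usage (commented out so the snippet can be sourced without a cluster):
# run_pipeline n_subjobs.txt
```

Note that the script itself blocks while JOB2 runs, so it has to stay alive for that long (for example under nohup or screen), but it does not poll with while + sleep.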