Tags: arrays, bash, slurm, job-scheduling, hpc

slurm - use array and limit the number of jobs running at the same time until they finish


Let's suppose I have the following bash script (bash.sh) to be run on an HPC cluster using Slurm:

#!/bin/bash
#SBATCH --job-name test
#SBATCH --ntasks 4
#SBATCH --time 00-05:00
#SBATCH --output out
#SBATCH --error err
#SBATCH --array=0-24

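# Read file.txt into an array, one line per element
readarray -t VARS < file.txt
# Pick the line whose index matches this array task (0-24)
VAR=${VARS[$SLURM_ARRAY_TASK_ID]}
export VAR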

bash my_script.sh

This script runs my_script.sh 25 times, each time with a different variable taken from file.txt. In other words, if I submit bash.sh with the command sbatch bash.sh, 25 jobs are launched all at once.

Is there a way to limit the number of jobs running at the same time (e.g. to 5) until all 25 have completed?

And if so, how can I do the same with 24 jobs in total (i.e. a number not divisible by 5)?

Thanks


Solution

  • Extract from Slurm's sbatch documentation:

    -a, --array=<indexes>
    ... A maximum number of simultaneously running tasks from the job array may be specified using a "%" separator. For example "--array=0-15%4" will limit the number of simultaneously running tasks from this job array to 4. ...

    This should limit the number of running jobs to 5 in your array:

    #SBATCH --array=0-24%5
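
On the second part of the question: the value after "%" is a cap on simultaneously running tasks, not a batch size, so the total does not need to be divisible by it. For 24 tasks, --array=0-23%5 should keep at most 5 running, with a new task starting as each one finishes. As a minimal sketch, the range can also be derived from file.txt so it always matches the number of lines. The helper name array_spec is hypothetical and not part of Slurm:

```shell
#!/bin/bash
# Sketch only: array_spec is a hypothetical helper, not a Slurm command.
# It builds an --array value such as "0-23%5" from the number of lines
# in a file, so the index range always matches the input.
array_spec() {
    local file=$1 limit=$2
    local lines
    # Same reader as bash.sh, so the count matches what each task will see.
    readarray -t lines < "$file"
    local n=${#lines[@]}
    if (( n == 0 )); then
        echo "no lines in $file" >&2
        return 1
    fi
    # Indices are zero-based, so the last one is n - 1.
    echo "0-$((n - 1))%${limit}"
}
```

With the function defined in the current shell, a submission might look like sbatch --array="$(array_spec file.txt 5)" bash.sh. Options given on the sbatch command line take precedence over the #SBATCH lines in the script. Slurm's job array documentation also describes changing the limit of an already submitted array with scontrol update ArrayTaskThrottle=<count> JobId=<jobid>.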