Do I need a single bash file for each task in SLURM?

I am trying to launch several tasks on a SLURM-managed cluster and would like to avoid dealing with dozens of files. Right now I have 50 tasks (indexed by i; for simplicity, i is also the input parameter of my program), and for each one a separate bash file that specifies the job configuration and the srun command:

#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1 
#SBATCH --mem=30G

srun python i

I then use another bash file to submit all these tasks:

for i in {1..50}; do
  sbatch slurm_run_$i
done

This works (50 jobs are running on the cluster), but I find it troublesome to have more than 50 input files. Searching for a solution, I came across the & operator, which runs a command in the background, and ended up with something like:


#SBATCH --ntasks=50
#SBATCH --cpus-per-task=1 
#SBATCH -J pltall
#SBATCH --mem=30G

# Running jobs 
srun python 1   &
srun python 2   & 
srun python 49  & 
srun python 50  & 
# Block until all background srun steps have finished
wait
echo "All done"

This seems to run as well. However, I cannot manage each of these tasks independently: the output of squeue shows a single job (pltall) running on a single node. As there are only 12 cores on each node in the partition I am working in, I assume most of my tasks are waiting on the single node I have been allocated. Setting the -N option doesn't change anything either. Moreover, I can no longer cancel individual tasks if I realize one of them contains a mistake, which sounds problematic to me.

Is my interpretation right, and is there a better way than my attempt to run several jobs in Slurm without getting lost among many files?


  • What you are looking for is the job array feature of Slurm.

    In your case, you would have a single submission file like this:

    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=1 
    #SBATCH -J pltCV
    #SBATCH --mem=30G
    #SBATCH --array=1-50
    srun python ${SLURM_ARRAY_TASK_ID}

    and then submit the array of jobs by passing that one file to sbatch.

    You will then see 50 jobs submitted, one per array index. You can cancel all of them at once or one by one. See the sbatch man page for details.
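
    As a rough sketch of the whole workflow, the snippet below writes the submission file once, submits it once, and prints the scancel commands you could use afterwards. The names job_array.sh and my_program.py are hypothetical placeholders, not taken from the question, and the snippet assumes bash. Submission is skipped when sbatch is not on the PATH.

    ```shell
    # Sketch only: job_array.sh and my_program.py are hypothetical names.
    # Write a single submission file covering all 50 tasks.
    printf '%s\n' \
        '#!/bin/bash' \
        '#SBATCH --ntasks=1' \
        '#SBATCH --cpus-per-task=1' \
        '#SBATCH -J pltCV' \
        '#SBATCH --mem=30G' \
        '#SBATCH --array=1-50' \
        '# Slurm sets SLURM_ARRAY_TASK_ID to 1..50, one value per array element.' \
        'srun python my_program.py "${SLURM_ARRAY_TASK_ID}"' \
        > job_array.sh

    if command -v sbatch >/dev/null 2>&1; then
        # --parsable prints the job ID, optionally followed by ";cluster".
        job_id=$(sbatch --parsable job_array.sh)
        job_id=${job_id%%;*}
        echo "Submitted array job ${job_id}"
        echo "Cancel element 7 only:   scancel ${job_id}_7"
        echo "Cancel elements 10-20:   scancel ${job_id}_[10-20]"
        echo "Cancel the whole array:  scancel ${job_id}"
    else
        echo "sbatch not found; wrote job_array.sh without submitting"
    fi
    ```

    In squeue, the array elements should show up as separate entries of the form jobid_index, so each one can be tracked and cancelled on its own while you still maintain only one file.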