Tags: bash, parallel-processing, hpc, slurm

Parallelize a bash script on an HPC cluster with Slurm


I have a file like this:

bash scripts/1-Call_HC.sh Uni_Mont_1_3 /home/db/fagiolo/config_fagiolo_Pvulgaris.sh > logs/HC_Uni_Mont_1_3.e 2> logs/HC_Uni_Mont_1_3.o
bash scripts/1-Call_HC.sh Uni_Mont_27_1 /home/db/fagiolo/config_fagiolo_Pvulgaris.sh > logs/HC_Uni_Mont_27_1.e 2> logs/HC_Uni_Mont_27_1.o
bash scripts/1-Call_HC.sh Uni_Mont_27_2 /home/db/fagiolo/config_fagiolo_Pvulgaris.sh > logs/HC_Uni_Mont_27_2.e 2> logs/HC_Uni_Mont_27_2.o
bash scripts/1-Call_HC.sh Uni_Mont_29_1 /home/db/fagiolo/config_fagiolo_Pvulgaris.sh > logs/HC_Uni_Mont_29_1.e 2> logs/HC_Uni_Mont_29_1.o
bash scripts/1-Call_HC.sh Uni_Mont_29_3 /home/db/fagiolo/config_fagiolo_Pvulgaris.sh > logs/HC_Uni_Mont_29_3.e 2> logs/HC_Uni_Mont_29_3.o
bash scripts/1-Call_HC.sh Uni_Mont_30_1 /home/db/fagiolo/config_fagiolo_Pvulgaris.sh > logs/HC_Uni_Mont_30_1.e 2> logs/HC_Uni_Mont_30_1.o
bash scripts/1-Call_HC.sh Uni_Mont_30_2 /home/db/fagiolo/config_fagiolo_Pvulgaris.sh > logs/HC_Uni_Mont_30_2.e 2> logs/HC_Uni_Mont_30_2.o

On a normal machine I would use nohup parallel -j 3 < file.sh to parallelize the execution. I would like to do the same on an HPC server with Slurm, using the server's queue. How can I do that? Thanks a lot, Denise


Solution

  • A job array should do the job here.

    Write a submission script like this:

    #!/bin/bash
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=...
    #SBATCH --mem-per-cpu=...
    #SBATCH --array=0-6
    
    OPTIONS=(1_3 27_1 27_2 29_1 29_3 30_1 30_2)
    CURROPT=${OPTIONS[$SLURM_ARRAY_TASK_ID]}
    
    bash scripts/1-Call_HC.sh Uni_Mont_$CURROPT /home/db/fagiolo/config_fagiolo_Pvulgaris.sh > logs/HC_Uni_Mont_$CURROPT.e 2> logs/HC_Uni_Mont_$CURROPT.o
    

    and submit it with sbatch. This will create 7 independent jobs (array indices 0 through 6) that will be scheduled in parallel. If you want to mimic parallel -j 3 and cap how many run at once, use the % separator in the array specification, e.g. #SBATCH --array=0-6%3, which limits the array to 3 simultaneously running tasks.
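
  • Alternatively, if you already have file.sh with one command per line (as in the question), you can skip the OPTIONS array and have each array task run its own line of the file directly. A minimal sketch, assuming file.sh sits in the submission directory; sed -n "Np" prints only line N, which is then executed by bash:

    #!/bin/bash
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=...
    #SBATCH --mem-per-cpu=...
    #SBATCH --array=1-7%3    # one task per line of file.sh, at most 3 running at once

    # SLURM_ARRAY_TASK_ID is set by Slurm for each array task (1..7 here).
    # Extract that line from file.sh and run it; the per-sample log
    # redirections already written in each line are preserved.
    sed -n "${SLURM_ARRAY_TASK_ID}p" file.sh | bash

    This keeps file.sh as the single source of truth: adding a sample only means appending a line and widening the --array range.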