Tags: bash, slurm, hpc

Run a job script with different variables read from another file


Let's suppose I have the following list of variables in a txt file (var.txt):

AAA
ABC
BBB
CCC

the following R script (script.R), where x is one variable in var.txt:

print(x)

and the following HPC slurm job script (job.sh):

#!/bin/bash
#SBATCH --job-name test
#SBATCH --ntasks 8
#SBATCH --time 04:00
#SBATCH --output out
#SBATCH --error err

Rscript script.R

How can I run the job.sh script 4 times in sequence, each time with a different variable inside script.R?

Expected output: 4 Slurm jobs, with script.R printing AAA, ABC, BBB, and CCC respectively.


Solution

  • This is a typical workload for a Slurm job array. With a submission script like this

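    #!/bin/bash
    #SBATCH --job-name test
    #SBATCH --ntasks 8
    #SBATCH --time 04:00
    # %a expands to the array index, so each task writes its own files
    #SBATCH --output out_%a
    #SBATCH --error err_%a
    #SBATCH --array=0-3
    
    # Read var.txt into a bash array, one element per line
    readarray -t VARS < var.txt
    # Pick the entry matching this task's index (0 to 3) and export it for R
    export VAR="${VARS[$SLURM_ARRAY_TASK_ID]}"
    
    Rscript script.R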
    

    and script.R being

    print(Sys.getenv("VAR"))
    

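    you will get a job array of four tasks, each running the R script with a different value of the environment variable VAR, taken from var.txt. Array tasks may run concurrently by default; to run them strictly one at a time, limit the array to one running task with --array=0-3%1.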
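
If var.txt changes length, the hard-coded range 0-3 has to change with it. The following is a minimal sketch, not part of the original answer, of how the range could be derived from the file and how the index lookup can be checked locally without Slurm. The helper names pick_var and last_index and the temporary demo file are hypothetical, and readarray assumes bash 4 or later.

```shell
#!/bin/bash
# Sketch: derive the array range from a variable file and dry-run the lookup.

# Hypothetical helper: print the line of FILE at 0-based position INDEX,
# mirroring what the job script does with SLURM_ARRAY_TASK_ID.
pick_var() {
    local file=$1 index=$2
    local -a vars
    readarray -t vars < "$file"
    printf '%s\n' "${vars[$index]}"
}

# Hypothetical helper: print the highest valid 0-based index for FILE.
last_index() {
    local -a vars
    readarray -t vars < "$1"
    echo $(( ${#vars[@]} - 1 ))
}

# Demo on a throwaway copy of the example data (assumption: one value per line,
# no trailing blank line).
demo_file=$(mktemp)
printf '%s\n' AAA ABC BBB CCC > "$demo_file"

# Show which value each array task would receive.
for i in $(seq 0 "$(last_index "$demo_file")"); do
    echo "task $i -> $(pick_var "$demo_file" "$i")"
done

# Submission command for this data, printed rather than executed.
echo "sbatch --array=0-$(last_index "$demo_file") job.sh"
```

Options given on the sbatch command line take precedence over the #SBATCH lines in the script, so a range passed this way replaces the --array=0-3 directive for that submission.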