Tags: r, parameter-passing, cluster-computing, hpc, slurm

How to run a job array in R using the Rscript command from the command line?


I am wondering how I might be able to run 500 parallel jobs in R using the Rscript command. I currently have an R file with the following header at the top:

args <- commandArgs(TRUE)
B <- as.numeric(args[1])
Num.Cores <- as.numeric(args[2])

From outside the R file, I want to pass which of the 500 jobs is to be run, specified by B. I would also like to control the number of cores/CPUs available to each job, Num.Cores.
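For example, a single job would be launched with something like "Rscript my_file.R 1 4", which sets B = 1 and Num.Cores = 4.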

I am wondering if there is software or a guide that would allow this. I currently have a CentOS 7/Linux server, and I know one way is to install Slurm. However, that is quite a hassle, and I was wondering whether there might be another way to execute 500 jobs through a queue. Thanks.


Solution

  • This is how I would set it up on a cluster using the SLURM scheduler

    1. SLURM sbatch job submission script

      #!/bin/bash
      
      #SBATCH --partition=xxx             ### Partition (like a queue in PBS)
      #SBATCH --job-name=array_example    ### Job Name
      #SBATCH -o jarray.%j.%N.out         ### File in which to store job output/error
      #SBATCH --time=00-00:30:00          ### Wall clock time limit in Days-HH:MM:SS
      #SBATCH --nodes=1                   ### Node count required for the job
      #SBATCH --ntasks=1                  ### Number of tasks to be launched per node
      #SBATCH --cpus-per-task=2           ### Number of threads per task (OMP threads)
      #SBATCH --mail-type=FAIL            ### When to send mail
      #SBATCH --mail-user=xxx@gmail.com
      #SBATCH --get-user-env              ### Import your user environment setup
      #SBATCH --requeue                   ### On failure, requeue for another try
      #SBATCH --verbose                   ### Increase informational messages
      #SBATCH --array=1-500%50            ### Array index range | %50: max number of simultaneous tasks
      
      echo
      echo "****************************************************************************"
      echo "*                                                                          *"
      echo "********************** sbatch script for array job *************************"
      echo "*                                                                          *"
      echo "****************************************************************************"
      echo
      
      current_dir=${PWD##*/}
      echo "Current dir: $current_dir"
      echo
      pwd
      echo
      
      # First we ensure a clean running environment:
      module purge
      
      # Load R
      module load R/R-3.5.0
      
      ### Initialization
      # Get Array ID
      i=${SLURM_ARRAY_TASK_ID}
      
      # Output file
      outFile="output_parameter_${i}.txt"
      
      # Pass the array task ID (and an output file name) to the R script
      Rscript --vanilla my_R_script.R ${i} ${outFile}
      
      echo
      echo '******************** FINISHED ***********************'
      echo
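
    Submit the script with sbatch (e.g. "sbatch array_job.sh"; the file name is up to you). Because of --array=1-500%50, Slurm queues all 500 tasks and runs at most 50 of them at the same time. You can monitor them with "squeue -u $USER".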
      
    2. my_R_script.R, which takes the arguments from the sbatch script

      args <- commandArgs(trailingOnly = TRUE)
      str(args)
      cat(args, sep = "\n")
      
      # test if there is at least one argument: if not, return an error
      if (length(args) == 0) {
        stop("At least one argument must be supplied (input file).\n", call. = FALSE)
      } else if (length(args) == 1) {
        # default output file
        args[2] <- "out.txt"
      }
      
      cat("\n")
      print("Hello World !!!")
      
      cat("\n")
      print(paste0("i = ", as.numeric(args[1])))
      print(paste0("outFile = ", args[2]))
      
      ### Parallel:
      # https://hpc.nih.gov/apps/R.html
      # https://github.com/tobigithub/R-parallel/blob/gh-pages/R/code-setups/Install-doSNOW-parallel-DeLuxe.R
      
      # load doSNOW (and parallel, for CPU info)
      library(doSNOW)
      library(parallel)   
      
      detectBatchCPUs <- function() { 
          ncores <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK")) 
          if (is.na(ncores)) { 
              ncores <- as.integer(Sys.getenv("SLURM_JOB_CPUS_PER_NODE")) 
          } 
          if (is.na(ncores)) { 
              return(2) # default
          } 
          return(ncores) 
      }
      
      ncpus <- detectBatchCPUs() 
      # or ncpus <- future::availableCores()
      cat(ncpus, "cores detected.\n")
      
      cluster <- makeCluster(ncpus)
      
      # register the cluster
      registerDoSNOW(cluster)
      
      # get info
      getDoParWorkers(); getDoParName();
      
      ##### insert parallel computation here #####
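      # For example, a minimal foreach/%dopar% loop (a toy computation as a
      # stand-in for the real work; doSNOW attaches foreach, so %dopar%
      # becomes usable once the cluster is registered above):
      res <- foreach(k = 1:100, .combine = c) %dopar% sqrt(k)
      str(res)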
      
      # stop cluster and remove clients
      stopCluster(cluster); print("Cluster stopped.")
      
      # re-register the sequential backend, otherwise repeated foreach calls error
      registerDoSEQ()
      
      # clean up a bit
      invisible(gc()); remove(ncpus); remove(cluster)
      
      # END
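
    You can test the R script on its own, outside of SLURM, with e.g. "Rscript --vanilla my_R_script.R 1 test.txt"; since the SLURM_* environment variables are then unset, detectBatchCPUs() falls back to its 2-core default.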
      

    P.S.: if you want to read a parameter file line by line, include the following lines in the sbatch script and then pass the parameters on to my_R_script.R (an R-side sketch follows the snippet):

        ### Parameter file to read 
        parameter_file="parameter_file.txt"
        echo "Parameter file: ${parameter_file}"
        echo
    
        # Read line #i from the parameter file
        PARAMETERS=$(sed "${i}q;d" ${parameter_file})
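        # ("${i}q;d" prints line ${i} and then quits; d suppresses every other line)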
        echo "Parameters are: ${PARAMETERS}"
        echo
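
    On the R side, if ${PARAMETERS} is passed unquoted (so the shell splits it on whitespace), each field of that line arrives as its own element of commandArgs(). A minimal sketch, assuming a hypothetical parameter file whose lines look like "0.5 100 out_1.txt":

        args <- commandArgs(trailingOnly = TRUE)
        # hypothetical fields: a numeric rate, an iteration count, an output file name
        rate    <- as.numeric(args[1])
        n.iter  <- as.integer(args[2])
        outFile <- args[3]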
    
