
Error calling an R script from Octave on an HPC cluster


I have an Octave script that calls an R script to do some calculations on an HPC cluster. The procedure is as follows:

  1. Submit a job on the cluster to get a compute node assigned, then distribute the calculation across the CPUs of that node. Part of the shell script looks like this:

    count=0
    HOSTLIST=
    for host in `cat $PBS_NODEFILE`
    do
        HOSTLIST[$count]=$host
        count=$(($count+1))
    done

    ...
    ...
    ...

    mkdir case_$count
    cd case_$count
    export workdir=`pwd`
    remotehost=${HOSTLIST[$pcount]}
    ssh -n $remotehost "cd $workdir; export PATH=$PATH:$workdir; octave $MFILE > /dev/null" &
    
  2. For the sake of simplicity, the sample $MFILE content is:

    printf("Calling R script from Octave \n");
    system('./hello_world.R');
    
  3. The hello_world.R

    #!/usr/bin/Rscript
    print("Hello World!")
    
  4. The error encountered when the job runs:

    sh: ./hello_world.R: /usr/bin/Rscript: bad interpreter: No such file or directory
    
  5. Some of my environment variables (just in case)

    $ echo $PATH
    /usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/cuda/bin:/opt/ganglia/bin:/opt/ganglia/sbin:/usr/java/latest/bin:/opt/maven/bin:/opt/maui/bin:/opt/torque/bin:/opt/torque/sbin:/opt/pvfs2/bin:/opt/rocks/bin:/opt/rocks/sbin

    $ which Rscript
    /usr/bin/Rscript

    $ which R
    /usr/bin/R
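
The values above come from an interactive shell on the login node, while the job reaches the compute node through a non-interactive `ssh`, where a different set of programs may be installed. As a minimal sketch of how the two could be compared, the helper below reads the interpreter named on a script's `#!` line and reports whether it exists on the host where it runs. The function names `shebang_interpreter` and `check_interpreter` are made up for this illustration and are not part of Octave, R, or Torque.

```shell
#!/bin/bash

# Print the interpreter path named on the first line of a script.
# Fails if the file is unreadable or does not start with "#!".
shebang_interpreter() {
    local first
    IFS= read -r first < "$1" || [ -n "$first" ] || return 1
    case "$first" in
        '#!'*) ;;
        *) return 1 ;;
    esac
    first=${first#'#!'}
    # Trim leading whitespace, then drop any interpreter arguments.
    first=${first#"${first%%[![:space:]]*}"}
    printf '%s\n' "${first%%[[:space:]]*}"
}

# Report whether the shebang interpreter of a script is executable here.
# Returns 0 if found, 1 if missing, 2 if the script has no shebang line.
check_interpreter() {
    local interp
    interp=$(shebang_interpreter "$1") || { echo "$1: no shebang line"; return 2; }
    if [ -x "$interp" ]; then
        echo "${HOSTNAME:-this host}: $interp found"
    else
        echo "${HOSTNAME:-this host}: $interp missing"
        return 1
    fi
}
```

Running `check_interpreter ./hello_world.R` once on the login node and once inside the same `ssh -n $remotehost "..."` command that the job uses would show whether `/usr/bin/Rscript` is present in both places.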
    

If I run $MFILE directly from the command line, it works and prints the expected output. I have tried many solutions I found online, to no avail.

Does anybody know what went wrong? Thanks for any suggestions!


Solution

  • The problem was that R was loaded on the login node but not on the compute node, so the job submission script needs a line that loads R before any calculation starts. For example:

        module load r/3.4.3
    

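
The same fix can be made defensive, so that the job stops early with a clear message instead of failing later with `bad interpreter`. The following is only a sketch under assumptions: `have_cmd` and `ensure_rscript` are hypothetical helper names, `R_MODULE` is an illustrative variable, and the module name `r/3.4.3` is taken from the example above and will differ between sites. It also assumes that the `module` command is already defined in the batch shell, which depends on how the cluster is set up.

```shell
#!/bin/bash

# Assumed, site-specific module name; check `module avail` on your cluster.
R_MODULE=${R_MODULE:-r/3.4.3}

# Succeed if the named command can be resolved in the current shell.
have_cmd() {
    command -v "$1" > /dev/null 2>&1
}

# Make sure Rscript is reachable, loading the R module only if needed.
ensure_rscript() {
    if have_cmd Rscript; then
        return 0
    fi
    # `module` is normally a shell function provided by Environment Modules
    # or Lmod; it may be undefined in some non-interactive shells.
    if have_cmd module; then
        module load "$R_MODULE"
    fi
    # Final status reflects whether Rscript is available now.
    have_cmd Rscript
}
```

In the job script this would be called once before the `ssh` commands are launched, for example `ensure_rscript || { echo "Rscript not available" >&2; exit 1; }`. Because the `ssh` command string in the question is double-quoted, `$PATH` is expanded by the job script before `ssh` runs, so if `module load` extends `PATH` (as it typically does), that extended value is what gets exported on the compute node.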