I have a complex model written in Matlab. The model was not written by us and is best thought of as a "black box" i.e. in order to fix the relevant problems from the inside would require rewritting the entire model which would take years.
If I have an "embarrassingly parallel" problem I can use an array to submit X variations of the same simulation with the option #SBATCH --array=1-X
. However, clusters normally have a (frustratingly small) limit on the maximum array size.
Whilst using a PBS/TORQUE cluster I have got around this problem by forcing Matlab to run on a single thread, requesting multiple CPUs and then running multiple instances of Matlab in the background. An example submission script is:
#!/bin/bash
<OTHER PBS COMMANDS>
#PBS -l nodes=1:ppn=5,walltime=30:00:00
#PBS -t 1-600
<GATHER DYNAMIC ARGUMENTS FOR MATLAB FUNCTION CALLS BASED ON ARRAY NUMBER>
# define Matlab options
options="-nodesktop -noFigureWindows -nosplash -singleCompThread"
for sub_job in {1..5}
do
<GATHER DYNAMIC ARGUMENTS FOR MATLAB FUNCTION CALLS BASED ON LOOP NUMBER (i.e. sub_job)>
matlab ${options} -r "run_model(${arg1}, ${arg2}, ..., ${argN}); exit" &
done
wait
<TIDY UP AND FINISH COMMANDS>
Can anyone help me do the equivalent on a SLURM cluster?
par
function will not run my model in a parallel loop in Matlab.While Tom's suggestion to use GNU Parallel is a good one, I will attempt to answer the question asked.
If you want to run 5 instances of the matlab
command with the same arguments (for example if they were communicating via MPI) then you would want to ask for --ncpus-per-task=1
, --ntasks=5
and you should preface your matlab
line with srun
and get rid of the loop.
In your case, as each of your 5 calls to matlab
are independent, you want to ask for --ncpus-per-task=5
, --ntasks=1
. This will ensure that you allocate 5 CPU cores per job to do with as you wish. You can preface your matlab
line with srun
if you wish but it will make little difference you are only running one task.
Of course, this is only efficient if each of your 5 matlab
runs take the same amount of time since if one takes much longer then the other 4 CPU cores will be sitting idle, waiting for the fifth to finish.