multithreading xargs slurm mpiexec

Combining xargs parallel and mpirun


I have an embarrassingly parallel bash script that runs on a computing cluster. Because it is a shell script and is not linked against any MPI library, the only way I can pass the MPI rank to it is as a command-line parameter.

So far I have only run it within a single node, where the solution was simple:

 #!/bin/bash
 #SBATCH --nodes=1
 N=16
 seq $N | xargs -P $N -I% my_script.bash % $N

How can I scale this to two nodes? If I just use '--nodes=2' and N=32, xargs will try to spawn all processes on the same node. On the other hand, I cannot use mpiexec alone, because the script is not linked to an MPI library and I do not know how to tell each instance which rank it is.


Solution

  • You can use srun within your submission script to do that:

    seq $N | xargs -P $N -I% srun --exclusive -N1 my_script.bash % $N
    

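    xargs still starts N processes on the node running the batch script, but each one is only an srun client: srun launches the bash script as its own job step, and Slurm places those steps on the CPUs allocated across the nodes.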
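
For completeness, a full two-node submission script built around that line might look like the sketch below. The `--ntasks=32` request, the extra `-n1` (so that each job step starts exactly one task), the `LAUNCH` override, and the `run_all` wrapper are assumptions added for illustration and are not part of the original answer. How `--exclusive` behaves for job steps can also depend on the Slurm version and site configuration, so treat this as a starting point rather than a definitive recipe.

```shell
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks=32

# Total number of script instances; assumed to match --ntasks above.
N=${N:-32}

# Launcher prefix for each instance. LAUNCH is a hypothetical override:
# setting LAUNCH=echo prints the commands instead of running them,
# which allows a dry run on a machine without Slurm.
LAUNCH=${LAUNCH:-srun --exclusive -N1 -n1}

run_all() {
    # xargs keeps up to N launcher processes alive at once; each one
    # starts my_script.bash with its index (1..N) and the total count.
    # $LAUNCH is left unquoted on purpose so it splits into words.
    seq "$N" | xargs -P "$N" -I% $LAUNCH my_script.bash % "$N"
}

# Only launch the real job steps when running inside a Slurm job.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    run_all
fi
```

Note that, as in the question, the index passed to the script runs from 1 to N rather than from 0 to N-1 as MPI ranks usually do.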