Search code examples
mpicluster-computinglsf

mpi job submission on lsf cluster


I usually process data on the University's cluster. Most jobs done before are based on parallel batch shell (divide job to several batches, then submit them parallel). An example of this shell is shown below:

#! /bin/bash
#BSUB -J model_0001
#BSUB -o z_output_model_0001.o
#BSUB -n 8
#BSUB -e z_output_model_0001.e
#BSUB -q general
#BSUB -W 5:00
#BSUB -B
#BSUB -N
some command

This time, I am testing some mpi job (based on mpi4py). The code has been tested on my laptop working on single task(1 task using 4 processor to run). Now I need to submit multi-task (30) jobs on the cluster (1 task using 8 processor to run). My design is like this: prepare 30 similar shell files above. command in each shell fill is my mpi command (something like "mpiexec -n 8 mycode.py args"). And each shell reserves 8 processors.

I submitted the jobs. But I am not sure if I am doing correctly. It's running but I am not sure if it runs based on mpi. How can I check? Here are 2 more questions:

1) For normal parallel jobs, usually there is a limit number I can reserve for single task -- 16. Above 16, I never succeeded. If I use mpi, can I reserve more? Because mpi is different. Basically I do not need continuous memory.

2) I think there is a priority rule on the cluster. For normal parallel jobs, usually when I reserve more processors for 1 task (say 10 tasks and 16 processors per task), it requires much more waiting time in the queue than reserving less less processors for single task (say divide each task to 8 sub-tasks (80 sub-tasks in total) and 2 processors per sub-task). If I can reserve more processors for mpi. Does it affects this rule? I worry that I am going to wait forever...


Solution

  • Well, increasing "#BSUB -n" is exactly what you need to do. That option tells how many execution "slots" you are reserving. So if you want to run an MPI job with 20 ranks, you need

    #BSUB -n 20
    

    IIRC the execution slots do not need to be allocated on the same node, LSF will allocate slots from as many nodes are required for the request to be satisfied. But it's been a while since I've used LSF, and I currently don't have access to a system using it, so I could be wrong (and it might depend on the local cluster LSF configuration).