I have an MPI program that writes out the time per iteration for a fixed amount of computation. When I run the code without submitting it to the queue (the cluster runs SGE), it reports the following times in seconds. I grabbed 8 processors using mpirun -np 8.
STEP   ITIME
--------------
   1   0.868128
   2   0.426714
   3   0.409768
   4   0.427312
   5   0.412737
   6   0.413256
   7   0.414480
   8   0.414984
   9   0.415683
  10   0.416826
But when I submit the same amount of work for the same 8 processors through the queue, each iteration takes almost four times as long:
STEP   ITIME
--------------
   1   3.189155
   2   1.594365
   3   1.600892
   4   1.589424
   5   1.605402
   6   1.589136
   7   1.599425
   8   1.591966
   9   1.601557
  10   1.603447
The following bash script was used to submit the job.
#!/bin/sh
#$ -S /bin/bash
#$ -pe orte 8
export PATH=~:$PATH
/opt/openmpi/bin/mpirun -np 8 ./exec
I would appreciate it if someone could point out what might be causing this issue.
In your first case (running the code without submitting to the queue), you are probably running all 8 processes on the same node. That's usually fine nowadays: you've likely got 8 cores.
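A quick way to check the core count, assuming a Linux node with GNU coreutils installed:

$ nproc

This prints the number of processing units available, so a value of 8 or more supports that theory.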
Try this out:
$ /opt/openmpi/bin/mpirun -np 8 uname -a
Did you get 8 identical lines?
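A slightly more direct variant is to count how many of the launched processes land on each host; this sketch only assumes the standard hostname, sort, and uniq utilities exist on the nodes:

$ /opt/openmpi/bin/mpirun -np 8 hostname | sort | uniq -c

One line with a count of 8 means all processes share a single node; eight lines with a count of 1 each means every process got its own machine.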
In the SGE case, you might get 8 physical machines, so now there is network communication involved. Confirm as above. I don't know SGE, but your environment no doubt has a "how to assign MPI processes" switch that controls whether processes are assigned depth-first (fill one node before moving on) or breadth-first (spread across nodes).
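For what it's worth, in SGE that switch appears to be the allocation_rule of the parallel environment. The following is a sketch based on a stock SGE installation, so verify it against your site's configuration:

$ qconf -sp orte

Look at the allocation_rule line of the output: $pe_slots packs all requested slots onto a single host, $fill_up fills each host's slots before moving to the next, and $round_robin hands out one slot per host in turn. You can also add this line to the job script just before the mpirun call, since SGE sets $PE_HOSTFILE for parallel jobs:

cat "$PE_HOSTFILE"

Each line of that file names a granted host and its slot count, so one host with 8 slots versus eight hosts with 1 slot each tells you immediately whether the interconnect is involved.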