Tags: mpich, pbs, torque, mpiexec

MPICH stop running across more than one node


I have an MPI Fortran application built with MPICH that launches and runs without problems if I use:

mpiexec -n 16 -f $PBS_NODEFILE   $PBS_O_WORKDIR/myMODEL.a

In the example above I am asking for 2 nodes, since each node on the cluster has 8 CPUs.

The problem is that my /home is NFS-mounted on the compute nodes through the head node, and I/O to these disks is very slow. Furthermore, my application does a lot of I/O, and from experience, heavy I/O to disks NFS-mounted through the head node can lock up the head node (this is bad) and leave it completely unresponsive.

The cluster has a disk that is mounted locally on each node for each job (I can reach this directory through the environment variable TMPDIR), so my job needs to run on that disk. Knowing this, my strategy is very simple:

  1. Move the files from /home to $TMPDIR
  2. Run the simulation in $TMPDIR
  3. After the model finishes, copy the outputs back to /home (a minimal sketch of this stage-out step follows the list)
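
Step 3 is not shown in the job scripts below; a minimal csh sketch of that stage-out, assuming the model writes hypothetical *.out files into $TMPDIR, might look like:

    # after mpiexec finishes, copy the results from the node-local
    # scratch back to the submission directory on /home
    # ("*.out" is only a placeholder for the model's real output files)
    cp $TMPDIR/*.out $PBS_O_WORKDIR/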

If I do all the steps above and ask the cluster system (PBS/Torque) for just one node, there is no problem:

 #!/bin/csh

 #PBS -N TESTE
 #PBS -o stdout_file.out
 #PBS -e stderr_file.err
 #PBS -l walltime=00:01:00
 #PBS -q debug
 #PBS -l mem=512mb
 #PBS -l nodes=1:ppn=8

 set NCPU        = `wc -l < $PBS_NODEFILE`
 set NNODES      = `uniq $PBS_NODEFILE | wc -l`

 cd $TMPDIR
 cp $PBS_O_WORKDIR/myMODEL.a ./myMODEL.a
 mpiexec -n $NCPU -f $PBS_NODEFILE   ./myMODEL.a

But if I ask for more than one node,

 #!/bin/csh

 #PBS -N TESTE
 #PBS -o stdout_file.out
 #PBS -e stderr_file.err
 #PBS -l walltime=00:01:00
 #PBS -q debug
 #PBS -l mem=512mb
 #PBS -l nodes=2:ppn=8

 set NCPU        = `wc -l < $PBS_NODEFILE`
 set NNODES      = `uniq $PBS_NODEFILE | wc -l`

 cd $TMPDIR
 cp $PBS_O_WORKDIR/myMODEL.a ./myMODEL.a
 mpiexec -n $NCPU -f $PBS_NODEFILE   ./myMODEL.a

I get the following error (the execvp line is repeated once for each of the 8 ranks launched on the second node):

 [proxy:0:1@compute-4-5.local] HYDU_create_process (/tmp/mvapich2-1.8.1/src/pm/hydra/utils/launch/launch.c:69): execvp error on file /state/partition1/74127.beach.colorado.edu/myMODEL.a (No such file or directory)
 [proxy:0:0@compute-0-1.local] HYD_pmcd_pmip_control_cmd_cb (/tmp/mvapich2-1.8.1/src/pm/hydra/pm/pmiserv/pmip_cb.c:955): assert (!closed) failed
 [proxy:0:0@compute-0-1.local] HYDT_dmxu_poll_wait_for_event (/tmp/mvapich2-1.8.1/src/pm/hydra/tools/demux/demux_poll.c:77): callback returned error status
 [proxy:0:0@compute-0-1.local] main (/tmp/mvapich2-1.8.1/src/pm/hydra/pm/pmiserv/pmip.c:226): demux engine error waiting for event
 [mpiexec@compute-0-1.local] HYDT_bscu_wait_for_completion (/tmp/mvapich2-1.8.1/src/pm/hydra/tools/bootstrap/utils/bscu_wait.c:70): one of the processes terminated badly; aborting
 [mpiexec@compute-0-1.local] HYDT_bsci_wait_for_completion (/tmp/mvapich2-1.8.1/src/pm/hydra/tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
 [mpiexec@compute-0-1.local] HYD_pmci_wait_for_completion (/tmp/mvapich2-1.8.1/src/pm/hydra/pm/pmiserv/pmiserv_pmci.c:191): launcher returned error waiting for completion
 [mpiexec@compute-0-1.local] main (/tmp/mvapich2-1.8.1/src/pm/hydra/ui/mpich/mpiexec.c:405): process manager error waiting for completion

What am I doing wrong?


Solution

  • It looks like when MVAPICH starts the processes on the second node it cannot find your executable: the cp in your script only runs on the first node of the job, so myMODEL.a never reaches the node-local scratch directory on the other node. Try adding the following before your mpiexec to copy your executable (and anything else you need) into the scratch directory of every node allocated to the job. I'm not a csh user, so you may be able to do this better.

    # copy the executable into the per-job scratch directory on every
    # node in the job; the braces keep csh from treating ":" as a
    # variable modifier on $n
    foreach n ( `uniq $PBS_NODEFILE` )
        scp $PBS_O_WORKDIR/myMODEL.a ${n}:$TMPDIR
    end
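
  For reference, here is a sketch of how that loop could slot into the multi-node job script, together with a matching loop that pulls the outputs back to /home after the run. It assumes password-less scp between the nodes, and the "*.out" pattern is only a placeholder for whatever files the model actually writes:

    #!/bin/csh

    #PBS -N TESTE
    #PBS -o stdout_file.out
    #PBS -e stderr_file.err
    #PBS -l walltime=00:01:00
    #PBS -q debug
    #PBS -l mem=512mb
    #PBS -l nodes=2:ppn=8

    set NCPU = `wc -l < $PBS_NODEFILE`

    # stage the executable into the node-local scratch of every node
    foreach n ( `uniq $PBS_NODEFILE` )
        scp $PBS_O_WORKDIR/myMODEL.a ${n}:$TMPDIR
    end

    cd $TMPDIR
    mpiexec -n $NCPU -f $PBS_NODEFILE ./myMODEL.a

    # stage the outputs back from every node ("*.out" is a placeholder)
    foreach n ( `uniq $PBS_NODEFILE` )
        scp ${n}:"$TMPDIR/*.out" $PBS_O_WORKDIR/
    end

  Note that only the heavy model I/O has to live on $TMPDIR; the executable itself is read just once at startup, which is why your very first example, launching $PBS_O_WORKDIR/myMODEL.a directly, did not hit the missing-file error.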