I've been working a bit with OpenMPI, and I'm not getting the expected behavior when querying the ranks of my processes.
I have a simple C program that is supposed to print each proc's rank :
minimal.c :
#include <stdio.h>
#include "mpi.h"

int
main (int argc, char *argv[])
{
  int procs;    /* number of processes; MPI_Comm_size takes an int * */
  int self;     /* rank of this process; MPI_Comm_rank takes an int * */
  MPI_Comm com;

  /* MPI init */
  MPI_Init (&argc, &argv);
  com = MPI_COMM_WORLD;
  MPI_Comm_size (com, &procs);
  MPI_Comm_rank (com, &self);
  printf ("My rank is %d\n", self);

  /* MPI Finalize */
  MPI_Finalize ();
  return 0;
}
which I compile with :
mpicc minimal.c -o minimal
Now, if I run the following command on my own computer :
mpirun -np 2 minimal
I get the following trace :
$ mpirun -np 2 minimal
My rank is 0
My rank is 0
which I found quite disconcerting.
So I kept digging through the mpirun manual and ended up printing additional information with -display-devel-map and -report-bindings. This is the trace I got :
$ mpirun -np 2 -display-devel-map -report-bindings minimal
 Data for JOB [53858,1] offset 0

 Mapper requested: NULL  Last mapper: round_robin  Mapping policy: BYCORE  Ranking policy: SLOT
 Binding policy: CORE:IF-SUPPORTED  Cpu set: NULL  PPR: NULL  Cpus-per-rank: 1
 Num new daemons: 0  New daemon starting vpid INVALID
 Num nodes: 1

 Data for node: UX31A  Launch id: -1  State: 2
  Daemon: [[53858,0],0]  Daemon launched: True
  Num slots: 2  Slots in use: 2  Oversubscribed: FALSE
  Num slots allocated: 2  Max slots: 0
  Username on node: NULL
  Num procs: 2  Next node_rank: 2
  Data for proc: [[53858,1],0]
   Pid: 0  Local rank: 0  Node rank: 0  App rank: 0
   State: INITIALIZED  App_context: 0
   Locale: [BB/..]
   Binding: [BB/..]
  Data for proc: [[53858,1],1]
   Pid: 0  Local rank: 1  Node rank: 1  App rank: 1
   State: INITIALIZED  App_context: 0
   Locale: [../BB]
   Binding: [../BB]
[UX31A:04861] MCW rank 1 bound to socket 0[core 1[hwt 0-1]]: [../BB]
[UX31A:04861] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]: [BB/..]
My rank is 0
My rank is 0
which left me puzzled.
I am using Ubuntu 16.04 and the OpenMPI packages from the apt repos. My computer is an Asus UX31a.
I'd be very grateful if someone could give me some insight on what is happening here.
Thank you !
I finally found what was going on thanks to Gilles Gouaillardet !
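Turns out I had the mpich libs installed along with the openmpi bins ! My program was linked against MPICH but launched by Open MPI's mpirun, so each process presumably started as an independent singleton and reported rank 0.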
Here's what I did :
Check which library was used inside my binary :
$ ldd minimal
...
libmpich.so.12 => /usr/lib/x86_64-linux-gnu/libmpich.so.12
...
$ dpkg -S /usr/lib/x86_64-linux-gnu/libmpich.so.12
libmpich12:amd64: /usr/lib/x86_64-linux-gnu/libmpich.so.12.1.0
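The ldd check above can be wrapped in a small helper. This is only a sketch: mpi_flavor is a hypothetical name, and it assumes the usual Debian/Ubuntu library names (libmpich.so.* for MPICH, libmpi.so.* for Open MPI), which may differ on other systems or versions. It reads ldd output on standard input, so it can be tried without touching the binary itself.

```shell
# Hypothetical helper: classify the MPI library named in `ldd` output.
# Assumption: MPICH ships libmpich.so.* and Open MPI ships libmpi.so.*.
# libmpich is tested first because it is the more specific pattern.
mpi_flavor() {
    input=$(cat)
    case "$input" in
        *libmpich.so*) echo "mpich" ;;
        *libmpi.so*)   echo "openmpi" ;;
        *)             echo "unknown" ;;
    esac
}

# Usage: ldd ./minimal | mpi_flavor
```

A result of "unknown" only means that neither name was found, for instance with a statically linked binary.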
Check which package provided my mpicc and mpirun binaries :
$ which mpirun
/usr/bin/mpirun
$ dpkg -S mpirun
openmpi-bin: /usr/bin/mpirun.openmpi
...
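The launcher can be identified in a similar way from its own version banner. Again a sketch under assumptions: launcher_flavor is a hypothetical name, and it assumes that Open MPI's mpirun identifies itself with the string "Open MPI" while MPICH's Hydra launcher prints "HYDRA"; other versions may word this differently.

```shell
# Hypothetical helper: classify the launcher from `mpirun --version` output.
# Assumption: Open MPI prints "Open MPI" and MPICH's Hydra prints "HYDRA".
launcher_flavor() {
    input=$(cat)
    case "$input" in
        *"Open MPI"*) echo "openmpi" ;;
        *HYDRA*)      echo "mpich" ;;
        *)            echo "unknown" ;;
    esac
}

# Usage: mpirun --version 2>&1 | launcher_flavor
```

If this reports a different implementation than the library the binary is linked against, the launcher and the library do not match, which is the situation described here.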
I removed the mpich packages I had installed :
sudo apt-get remove libmpich12 libmpich-dev
I installed the openmpi libraries I needed :
sudo apt-get install libopenmpi-dev
I compiled again once this was done :
$ mpicc minimal.c -o minimal
$ mpirun -np 2 minimal
My rank is 0
My rank is 1
Hurray !