I installed MPICH on an Odroid N2+ and a Raspberry Pi. The Odroid N2+ is the more powerful board, and I want it to run the majority of my program; the Raspberry Pi acts as a separate server that communicates with the program. I installed MPICH on both boards using the command apt install mpich. The Odroid board is running Ubuntu 20.04.4 LTS, while the Raspberry Pi is running 2020-08-20-raspios-buster-armhf-lite. On both I ran the code posted at example mpi send receive.
If I run the code only on the Odroid board, it works fine; if I run it on the Raspberry Pi, I get exit code 11. I therefore reinstalled MPICH on both boards manually from https://www.mpich.org/static/downloads/3.3/mpich-3.3.tar.gz
After reinstalling and recompiling, the code works on each board individually. I can also run the command mpirun -f machinefile -n 2 hostname, and it prints the hostname of each board. This tells me that I set up SSH correctly and that MPICH is able to log in to both boards. However, when I run the code above, which sends and receives messages, it hangs when MPI_Ssend and MPI_Wait(&request, MPI_STATUS_IGNORE); are called.
I compiled it with the command mpic++ test_sendRecv.cpp -Wall -Werror -o test_sendRecv, then ran it using mpirun -f machinefile -n 2 ./test_sendRecv. If I use mpiexec instead of mpirun, I get the same issue.
If I am correct, this means that the boards are unable to pass messages to each other?
Is there a way to remedy this?
I had the same issue with other boards. I never experimented with it enough to figure out what was causing the problem or to come up with a solution. However, I did create a repository that uses sockets as a kind of pseudo-MPI. It only passes string messages, but I'm sure it would be pretty straightforward to add extra types, and even file passing. I put the repository here
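To give a rough idea of what passing string messages over sockets can look like, here is a minimal sketch. It is not the code from the repository, just an illustration of the general technique: each message is sent as a 4-byte length in network byte order followed by the bytes of the string. The names send_string, recv_string and demo_roundtrip are made up for this example, and demo_roundtrip uses a local socketpair() so it can be tried on a single board; between two boards you would use a connected TCP socket instead.

```cpp
#include <arpa/inet.h>   // htonl, ntohl
#include <sys/socket.h>  // socketpair, send, recv
#include <sys/types.h>
#include <unistd.h>      // close
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Send exactly len bytes, looping because send() may write fewer.
static void write_all(int fd, const char* buf, std::size_t len) {
    while (len > 0) {
        ssize_t n = ::send(fd, buf, len, 0);
        if (n <= 0) throw std::runtime_error("send failed");
        buf += n;
        len -= static_cast<std::size_t>(n);
    }
}

// Receive exactly len bytes, looping because recv() may return fewer.
static void read_all(int fd, char* buf, std::size_t len) {
    while (len > 0) {
        ssize_t n = ::recv(fd, buf, len, 0);
        if (n <= 0) throw std::runtime_error("recv failed or peer closed");
        buf += n;
        len -= static_cast<std::size_t>(n);
    }
}

// Hypothetical helper: send one string as [4-byte length][payload].
void send_string(int fd, const std::string& msg) {
    assert(fd >= 0);
    std::uint32_t net_len = htonl(static_cast<std::uint32_t>(msg.size()));
    write_all(fd, reinterpret_cast<const char*>(&net_len), sizeof net_len);
    write_all(fd, msg.data(), msg.size());
}

// Hypothetical helper: receive one string framed by send_string.
std::string recv_string(int fd) {
    assert(fd >= 0);
    std::uint32_t net_len = 0;
    read_all(fd, reinterpret_cast<char*>(&net_len), sizeof net_len);
    std::string msg(ntohl(net_len), '\0');
    if (!msg.empty()) read_all(fd, &msg[0], msg.size());
    return msg;
}

// Local demonstration: send a short message through one end of a
// socketpair and read it back from the other end. Only suitable for
// small messages, since the send happens before the receive in the
// same thread and relies on the kernel socket buffer.
std::string demo_roundtrip(const std::string& msg) {
    int fds[2];
    if (::socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        throw std::runtime_error("socketpair failed");
    send_string(fds[0], msg);
    std::string out = recv_string(fds[1]);
    ::close(fds[0]);
    ::close(fds[1]);
    return out;
}
```

Calling demo_roundtrip("hello") from a main function should hand back the same string. The length prefix is what lets the receiver know where one message ends and the next begins, which a raw TCP stream does not tell you on its own; other types or files could be layered on top by adding a type tag before the length.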