Tags: c, mpi, matrix-multiplication, openmpi

MPI_Recv hangs while waiting for a message that has been sent from a slave process


I'm trying to write a matrix multiplication that uses MPI (the OpenMPI implementation). Sending parts of a matrix from the master to the slaves with MPI_Send/MPI_Recv works properly. The problem is that the MPI_Recv in the master process that is supposed to receive the answers from the slaves (marked by the //!!! comment) waits indefinitely and never receives any answer.

However, I can see that the slave processes are sending their answers (see the debugMessage on line 167).

To make the question clear, the full code can be found here: http://pastebin.com/ZY9jQXDD

Does anybody know where the problem lies?


Solution

  • Your problem is as simple as mismatched tag values. The master process expects messages with tag value of 0:

    MPI_Recv(&ans, sizeof(answer), MPI_BYTE, MPI_ANY_SOURCE, 0,  /* <-- tag = 0 */
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);

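    while the worker processes send their messages with the tag RESULT, which happens to be defined as 1. A receive only matches a message whose tag equals the one it asks for, so the master never sees the answers. Either put the proper tag (RESULT) in the master's receive call, or use MPI_ANY_TAG if the workers can send messages with various tags.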

    Gratuitous advice: sending structures using MPI_BYTE is extremely anti-MPI and very non-portable style. Construct a derived datatype with MPI_Type_create_struct in order to send structures in a portable way.