Tags: c++, matrix, mpi, distributed-computing, distributed-system

Using MPI_Irecv and MPI_Isend in a for loop


I have an issue with MPI_Isend and MPI_Irecv. I am working on the adjacency matrix of a graph, which is distributed row-wise; we can assume each processor holds one row. For each pair of indices (i,j) I need to send and receive 2 integers, because each row needs some information from the other rows in order to do its computations. I am new to MPI, and the code below goes into an infinite loop. I am not even sure whether this is the right way of using MPI_Isend and MPI_Irecv in a for loop, or where the waits should be placed.

As an example, assume we have a graph with 6 vertices, so the adjacency matrix (adjMatrix) will be a 6*6 matrix. We also have a 6*2 matrix M for some other information. Finally, we distribute the data among 6 processors. Therefore:

          |0  20 16 0  6  0 |      |0  1|
          |20 0  0  19 0  6 |      |1  1|
adjMatrix=|16 0  0  0  12 0 |    M=|2  1|
          |0  19 0  0  0  12|      |3  1|
          |6  0  12 0  0  9 |      |0  0|
          |0  6  0  12 9  0 |      |1  0|

We distribute the matrices as follows:

P0:       |0  20 16 0  6  0 |      |0  1|

P1:       |20 0  0  19 0  6 |      |1  1|

P2:       |16 0  0  0  12 0 |      |2  1|

P3:       |0  19 0  0  0  12|      |3  1|

P4:       |6  0  12 0  0  9 |      |0  0|

P5:       |0  6  0  12 9  0 |      |1  0|
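
For reference, a minimal sketch of how such a row-wise distribution could be set up with MPI_Scatter (the buffer names here are illustrative, not from the original code; it assumes n is divisible by the number of ranks p, otherwise MPI_Scatterv would be needed):

    /* Sketch: rank 0 holds the full n*n adjMatrix and n*2 M stored
       contiguously; each rank receives n/p consecutive rows of both. */
    int p, my_rank;
    MPI_Comm_size(MPI_COMM_WORLD, &p);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    int rows = n / p;                      /* assumes n is divisible by p */

    MPI_Scatter(adjMatrix, rows * n, MPI_INT, adjLocal, rows * n, MPI_INT,
                0, MPI_COMM_WORLD);
    MPI_Scatter(M, rows * 2, MPI_INT, MLocal, rows * 2, MPI_INT,
                0, MPI_COMM_WORLD);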

Now, each processor needs to update its portion of adjMatrix. To do so, it needs information from some portion of the matrix M, which lives on other processors. For example, for P0 to update the entry (0,1), which is 20, it needs access to row 1 of matrix M, which is {1,1}. Therefore:

P1 should send MLocal[0][0]=1 and MLocal[0][1]=1 to P0, which receives them as M_j0 and M_j1, respectively.

And

P0 should send MLocal[0][0]=0 and MLocal[0][1]=1 to P1, which receives them as M_j0 and M_j1, respectively.

    MPI_Request send_request0, send_request1, recv_request0, recv_request1;
    MPI_Status status;

    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < n; j++) {
            int M_j0, M_j1;
            MPI_Isend(&MLocal[i][0], 1, MPI_INT, j, my_rank+i*n+j+0, MPI_COMM_WORLD, &send_request0);
            MPI_Isend(&MLocal[i][1], 1, MPI_INT, j, my_rank+i*n+j+1, MPI_COMM_WORLD, &send_request1);
            MPI_Irecv(&M_j0, 1, MPI_INT, j, my_rank+i*n+j+0, MPI_COMM_WORLD, &recv_request0);
            MPI_Irecv(&M_j1, 1, MPI_INT, j, my_rank+i*n+j+1, MPI_COMM_WORLD, &recv_request1);
            //MPI_Wait(&send_request0, &status);
            //MPI_Wait(&send_request1, &status);
            MPI_Wait(&recv_request0, &status);
            MPI_Wait(&recv_request1, &status);

            // Do something ...
        }
    }

Then, following GillesGouaillardet's suggestion, I changed those four MPI_Isend and MPI_Irecv calls to:

    MPI_Sendrecv(&MLocal[i][0], 1, MPI_INT, j, my_rank+i*n+j+0, &M_j0, 1, MPI_INT, my_rank, my_rank+i*n+j+0, MPI_COMM_WORLD, &status);
    MPI_Sendrecv(&MLocal[i][1], 1, MPI_INT, j, my_rank+i*n+j+1, &M_j1, 1, MPI_INT, my_rank, my_rank+i*n+j+1, MPI_COMM_WORLD, &status);

But it still goes into an infinite loop.
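
For comparison, here is a minimal sketch of what a matched pairwise exchange with MPI_Sendrecv would look like, assuming partner is the rank that owns row j (partner and TAG are illustrative names, not from the code above): both sides name the partner as destination and source, and use the same tag in both directions.

    /* Matched pairwise exchange (sketch): dest and source are both the
       partner rank, and both sides use the same tag, so the calls pair up. */
    int mine[2] = { MLocal[i][0], MLocal[i][1] };
    int theirs[2];                      /* will hold the partner's M row */
    MPI_Status status;
    MPI_Sendrecv(mine,   2, MPI_INT, partner, TAG,
                 theirs, 2, MPI_INT, partner, TAG,
                 MPI_COMM_WORLD, &status);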

UPDATE:

I updated the code; part of the issue was the processor ranks and the matching of the tags. I fixed that part, but the code is still deadlock-prone, and I think I know where the problem is (and may not be able to solve it). If I had enough processors to give each row its own processor, i.e. n=p, there would be no issue. But the problem arises when the number of processors is less than n: then the flow does not pass nicely through the main diagonal. Let me explain through an example; assume we have 4 processors and n=6, with the following distribution:

P0:       |0  20 16 0  6  0 |      |0  1|

P1:       |20 0  0  19 0  6 |      |1  1|
          |16 0  0  0  12 0 |      |2  1|

P2:       |0  19 0  0  0  12|      |3  1|

P3:       |6  0  12 0  0  9 |      |0  0|
          |0  6  0  12 9  0 |      |1  0|

This is what happens as we go through the loop.

The first iteration:

P0 sends to / receives from P1 the information for (0,1): "20", and waits (done)

P1 sends to / receives from P0 the information for (1,0): "20", and waits (done)

P2 sends to / receives from P1 the information for (3,1): "19", and waits

P3 sends to / receives from P0 the information for (4,0): "6", and waits

The second iteration:

P0 sends to / receives from P1 the information for (0,2): "16", and waits

P1 sends to / receives from P2 the information for (1,3): "19", and waits (done)

P2 was waiting for P1 for (3,1): "19", has just received it, and is done!

P3 is still waiting for P0 for (4,0): "6"

The third iteration:

P0 is waiting for P1 for (0,2): "16"

P1 sends to / receives from P3 the information for (1,5): "6", and waits

P2 sends to / receives from P3 the information for (3,5): "12", and waits

P3 is waiting for P0 for (4,0): "6"

The fourth iteration:

P0 is waiting for P1 for (0,2): "16"

P1 is waiting for P3 for (1,5): "6"

P2 is waiting for P3 for (3,5): "12"

P3 is waiting for P0 for (4,0): "6"

Now they are all waiting for each other, and I do not think there is any way to resolve it. The solution that ptb suggested might work; I will try that one.

Still, any other idea is appreciated!
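
For reference, one standard way to break a cyclic wait like this, separate from the probe-based approach in the answer below, is to post all the nonblocking receives and sends for a row up front and only then complete them together with MPI_Waitall, so no rank blocks in the middle of the loop. A minimal sketch, where owner_of_row(), global_row(), and MAX_NBRS are illustrative helpers, not anything from the original code:

    /* Sketch: post every nonblocking receive and send for local row i
       first, then wait on all of them at once. Receives are tagged with
       the global row we ask about, sends with our own global row, so the
       tags match the partner's mirrored calls. */
    MPI_Request reqs[2 * MAX_NBRS];   /* MAX_NBRS: illustrative upper bound */
    int M_in[MAX_NBRS][2];            /* incoming M rows, one per neighbor  */
    int nreq = 0, nrecv = 0;

    for (int j = 0; j < n; j++) {
        if (adjLocal[i][j] != 0) {
            int partner = owner_of_row(j);           /* rank holding row j */
            MPI_Irecv(M_in[nrecv++], 2, MPI_INT, partner, j,
                      MPI_COMM_WORLD, &reqs[nreq++]);
            MPI_Isend(MLocal[i], 2, MPI_INT, partner, global_row(i),
                      MPI_COMM_WORLD, &reqs[nreq++]);
        }
    }
    /* Nothing blocks until every transfer for this row has been posted. */
    MPI_Waitall(nreq, reqs, MPI_STATUSES_IGNORE);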


Solution

  • There are a few issues with the code you posted:

    1. Each processor will loop through rows. However, in your description the rows are distributed among the processors, so this is probably an error.
    2. The send and recv destinations and sources are identical. Consider the case j=0: MPI_Isend(...,j,...) means every rank sends something to the root process. That is followed by MPI_Irecv(...,j,...) and MPI_Wait, which means every process waits for a message from the root process that never comes.
    3. The MPI_Sendrecv call has the same fundamental problem.

    The challenge is that your send and recv calls need to match up. One way to do that (not necessarily the most performant) is to post all your sends via MPI_Isend in a loop, and then use MPI_Probe and MPI_Recv to process each rank's receives (since the adjacency matrix is symmetric, the number of receives equals the number of sends, so you know exactly how many to expect). A pseudo-code example:

    int M_j01[2];
    int send_count = 0;
    MPI_Request reqs[n];                 /* one request per posted send */

    for (int j = 0; j < n; j++) {
      if (matrix_entry[j] != 0) {
        MPI_Isend(M_local, 2, MPI_INT, j, 0, MPI_COMM_WORLD, &reqs[send_count]);
        send_count++;
      }
    }

    int recv_count = send_count;   /* symmetric matrix: one recv per send */
    while (recv_count) {
      MPI_Status status;
      /* Block until any message arrives, then read its source and tag */
      MPI_Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
      MPI_Recv(M_j01, 2, MPI_INT, status.MPI_SOURCE, status.MPI_TAG,
               MPI_COMM_WORLD, MPI_STATUS_IGNORE);
      /* Do something with M_j01 */
      recv_count--;
    }

    MPI_Waitall(send_count, reqs, MPI_STATUSES_IGNORE);  /* complete sends */
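
    Note that the trailing MPI_Waitall matters: until a send request completes, MPI does not guarantee that the send buffer can safely be reused or freed, and the request object itself is not released.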