performance parallel-processing mpi wait hpc

What is the difference between MPI_Send() and MPI_Isend() followed by MPI_Wait()?

I am not able to understand the difference between MPI_Send() and MPI_Isend() followed by MPI_Wait().

Isn't it the case that calling MPI_Wait() after MPI_Isend() turns it into a blocking call, since we must wait until all the elements have been copied into the buffer?

I know that this configuration (given below) may lead to a deadlock

  P1--> MPI_Send() MPI_Recv()             
  P2--> MPI_Send() MPI_Recv()

But can this configuration (given below) also lead to deadlock?

 P1--> MPI_Isend() MPI_Wait() MPI_Recv()             
 P2--> MPI_Isend() MPI_Wait() MPI_Recv()

Solution

TL;DR: You need to call MPI_Wait (or poll with MPI_Test until the request completes) to ensure that the operation has completed and that the data in the send/receive buffer can safely be used again.

More detailed answer:

MPI_Isend

Begins a nonblocking send

int MPI_Isend(const void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request *request)

Let us imagine that you send an array of ints using MPI_Isend without calling MPI_Wait; in that case you cannot be sure when it is safe to modify (or deallocate the memory of) that array. The same applies to MPI_Irecv. Calling MPI_Wait, however, ensures that from that point onwards one can read/write (or deallocate the memory of) the buffer without risk of undefined behavior or inconsistent data.

While an MPI_Isend is in progress, the content of the buffer (e.g., the array of ints) has to be read and sent; likewise, while an MPI_Irecv is in progress, the content of the receive buffer has yet to arrive. In the meantime, one can overlap some computation with the ongoing communication. However, this computation must not modify the send buffer (older versions of the MPI standard also forbade reading it), and it must neither read nor modify the receive buffer. One then calls MPI_Wait to ensure that from that point onwards the data sent/received can be safely read/modified.

I am not able to understand the difference between MPI_Send() and MPI_Isend() followed by MPI_Wait().

From this SO Thread one can read:

These functions do not return (i.e., they block) until the communication is finished. Simplifying somewhat, this means that the buffer passed to MPI_Send() can be reused, either because MPI saved it somewhere, or because it has been received by the destination.

Finally:

I know that this configuration (given below) may lead to a deadlock

P1--> MPI_Send() MPI_Recv()
P2--> MPI_Send() MPI_Recv()

But can this configuration (given below) also lead to deadlock?

P1--> MPI_Isend() MPI_Wait() MPI_Recv()
P2--> MPI_Isend() MPI_Wait() MPI_Recv()

If the exchange of messages only happens between the processes P1 and P2, then yes. Semantically a call to MPI_Isend() followed by a call to MPI_Wait() is the same as calling MPI_Send().

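P1 sends a message to P2 and waits for that send to complete, while P2 sends a message to P1 and also waits. If the MPI implementation does not buffer the messages internally (which is typically the case for large messages), neither send can complete until the matching MPI_Recv has been posted, and neither process ever reaches its MPI_Recv. Since each process is waiting on the other, this leads to a deadlock.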