
MPI_SEND doesn't wait for MPI_RECV to complete


I am trying to implement the MPI_BARRIER function myself using the MPI_SEND and MPI_RECV functions in Fortran. I am extremely sorry if this has been asked before; any help is appreciated.

Program - I:

! Barrier Function Example Program 

program barrier 
    implicit none 
    include 'mpif.h'

    integer :: rank, nproc, ierr, tag, msg, root 
    integer :: i 

    call mpi_init(ierr)
    call mpi_comm_size(mpi_comm_world, nproc, ierr)
    call mpi_comm_rank(mpi_comm_world, rank, ierr)
    
    call sleep(rank)

    tag = 0 ; msg = 10 ; root = 0
    ! Barrier Function [RHS]
    if(rank == 0) then
        do i = 2 , nproc
            call mpi_recv(msg, 1, mpi_integer, i - 1, mpi_any_tag, mpi_comm_world, mpi_status_ignore, ierr)
        enddo
        do i = 2, nproc 
            call mpi_send(msg, 1, mpi_integer, i - 1, tag, mpi_comm_world, ierr)
        enddo   
    else 
        call mpi_send(msg, 1, mpi_integer, root, tag, mpi_comm_world, ierr)
        call mpi_recv(msg, 1, mpi_integer, root, mpi_any_tag, mpi_comm_world, mpi_status_ignore, ierr)
    endif 

    write(*,*) "ID: ",rank 

    call mpi_finalize(ierr)
end program barrier

Program - II:

! Barrier Function Example Program 

program barrier 
    implicit none 
    include 'mpif.h'

    integer :: rank, nproc, ierr, tag, msg, root 
    integer :: i 

    call mpi_init(ierr)
    call mpi_comm_size(mpi_comm_world, nproc, ierr)
    call mpi_comm_rank(mpi_comm_world, rank, ierr)
    
    call sleep(rank)

    tag = 0 ; msg = 10 ; root = 0
    ! Barrier Function [RHS]
    if(rank == 0) then
        do i = 2, nproc 
            call mpi_send(msg, 1, mpi_integer, i - 1, tag, mpi_comm_world, ierr)
        enddo
        do i = 2 , nproc
            call mpi_recv(msg, 1, mpi_integer, i - 1, mpi_any_tag, mpi_comm_world, mpi_status_ignore, ierr)
        enddo

    else 
        call mpi_recv(msg, 1, mpi_integer, root, mpi_any_tag, mpi_comm_world, mpi_status_ignore, ierr)
        call mpi_send(msg, 1, mpi_integer, root, tag, mpi_comm_world, ierr)
    endif 

    write(*,*) "ID: ",rank 

    call mpi_finalize(ierr)
end program barrier

Notice that only the order of MPI_SEND and MPI_RECV functions has changed. However, when executing the programs, Program-I is able to implement the barrier while Program-II is not.

However, to my understanding, MPI_SEND waits until the message has been received by a matching MPI_RECV. What might be the issue with the program here?

It seems that MPI_SEND doesn't wait for MPI_RECV to complete.


Solution

  • MPI_Send() returns when the application can safely overwrite the send buffer. This does not imply that the message has been received by the destination process. Depending on the specifics of your MPI implementation and other factors, you might observe that MPI_Send() returns immediately for short messages, but blocks until a corresponding receive is initiated for longer messages. However, you should not rely on this behavior: according to the MPI standard, any program where a blocking send could lead to a deadlock is considered incorrect.

    MPI_Ssend() (note the double "s"), in contrast, does not return until the matching receive operation has been started by the destination MPI process.
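
As a rough illustration of the point above, here is a minimal sketch of Program-II with each MPI_SEND swapped for MPI_SSEND. The program name barrier_ssend is made up for this example, and MPI_INTEGER is used as the datatype for the Fortran integer buffer. Because a synchronous send cannot complete before the matching receive has started, rank 0 should stay stuck in its send loop until every other rank has reached its MPI_RECV, and no other rank should get past its own MPI_SSEND before rank 0 starts collecting replies. This is only a sketch under those assumptions, not a replacement for MPI_BARRIER.

```fortran
! Sketch: Program-II from the question, using synchronous sends.
! Assumption: barrier_ssend is a hypothetical name for this example.

program barrier_ssend
    implicit none
    include 'mpif.h'

    integer :: rank, nproc, ierr, tag, msg, root
    integer :: i

    call mpi_init(ierr)
    call mpi_comm_size(mpi_comm_world, nproc, ierr)
    call mpi_comm_rank(mpi_comm_world, rank, ierr)

    ! Stagger the arrival times, as in the question (sleep is a compiler extension)
    call sleep(rank)

    tag = 0 ; msg = 10 ; root = 0
    if (rank == root) then
        ! Each synchronous send completes only after rank i has started its receive
        do i = 1, nproc - 1
            call mpi_ssend(msg, 1, mpi_integer, i, tag, mpi_comm_world, ierr)
        enddo
        ! Only now does the root start accepting the replies
        do i = 1, nproc - 1
            call mpi_recv(msg, 1, mpi_integer, i, mpi_any_tag, mpi_comm_world, mpi_status_ignore, ierr)
        enddo
    else
        call mpi_recv(msg, 1, mpi_integer, root, mpi_any_tag, mpi_comm_world, mpi_status_ignore, ierr)
        ! Cannot complete until the root has started the matching receive
        call mpi_ssend(msg, 1, mpi_integer, root, tag, mpi_comm_world, ierr)
    endif

    write(*,*) "ID: ", rank

    call mpi_finalize(ierr)
end program barrier_ssend
```

With the standard MPI_SEND, a small message like this one is often buffered, so the sends return immediately and the early ranks can print before the late ones arrive. Note that the order in which the "ID" lines appear on the terminal also depends on how the launcher forwards output, so it is only a loose indicator of when each rank left the barrier.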