parallel-processing, mpi, ipc, openmpi

Is it safe to call MPI_Finalize immediately after MPI_Send or MPI_Ssend?


Suppose I have 3 nodes with ranks 0, 1 and 2. Ranks 1 and 2 do some calculations and then transfer their results to rank 0. Is it safe to call MPI_Send or MPI_Ssend, then immediately MPI_Finalize, and then immediately return from main?

Consider this example: rank 2 finishes much faster, sends its results to 0 and finalizes. After that, 0 and 1 will still need to communicate once 1 has finished. Is this an allowed MPI "state"? The standard repeatedly states that no communication via MPI may occur after MPI_Finalize has been called. However, it seems a little impractical if that applies to all ranks and not only the ones that have finalized.
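A minimal sketch of the scenario in question (the workloads and values are made up for illustration; compile with mpicc and run with mpiexec -n 3):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, result;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Status status;
        /* Receive from whichever worker finishes first, then the other. */
        for (int i = 0; i < 2; i++) {
            MPI_Recv(&result, 1, MPI_INT, MPI_ANY_SOURCE, 0,
                     MPI_COMM_WORLD, &status);
            printf("got %d from rank %d\n", result, status.MPI_SOURCE);
        }
    } else {
        /* Stand-in for real work; rank 2 finishes much sooner than rank 1. */
        result = rank * 10;
        MPI_Ssend(&result, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        /* Rank 2 reaches the line below long before rank 1 does. */
    }

    /* Is it safe for rank 2 to get here while 0 and 1 still communicate? */
    MPI_Finalize();
    return 0;
}
```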

Furthermore, it is unclear to me whether the MPI_Send and MPI_Ssend calls are safe, since at least for MPI_Send the call may return before rank 0 has posted a matching receive. What happens if MPI_Finalize is called immediately after that, while there is still data in some internal buffer? Does that also apply to MPI_Ssend?


Solution

  • MPI_FINALIZE is collective over all connected ranks but, as with all other collectives except MPI_BARRIER, it may return early. When you call it, the MPI library cleans up after itself and, if possible, returns control back to you. The standard does not require that behaviour though - an implementation that waits until all the other ranks have called MPI_FINALIZE, or simply never returns (except in rank 0 of MPI_COMM_WORLD), is still a compliant one.

    As long as the rest of the MPI ranks do not try to communicate with a rank that has called MPI_FINALIZE, they are free to continue communicating with each other. This does, however, preclude any collective calls, except on communicators that the finalized rank is not a member of. The scenario you are describing is a perfectly valid use of MPI.
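    As a sketch of that exception, the remaining ranks can still perform collectives on a sub-communicator that excludes the rank that finalizes early (the split-by-color setup here is one assumed way to arrange this, not the only one):

    ```c
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, val, sum;
        MPI_Comm sub;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Color 0 for ranks 0 and 1; MPI_UNDEFINED excludes rank 2, which
         * receives MPI_COMM_NULL instead of a new communicator. */
        MPI_Comm_split(MPI_COMM_WORLD, rank < 2 ? 0 : MPI_UNDEFINED,
                       rank, &sub);

        if (rank == 2) {
            /* ...finish early, send results to 0, then finalize... */
            MPI_Finalize();
            return 0;
        }

        /* Collective involving only ranks 0 and 1; legal even after
         * rank 2 has called MPI_FINALIZE. */
        val = rank + 1;
        MPI_Allreduce(&val, &sum, 1, MPI_INT, MPI_SUM, sub);

        MPI_Comm_free(&sub);
        MPI_Finalize();
        return 0;
    }
    ```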

    Regarding what happens when you call MPI_FINALIZE, the standard requires that you take measures to complete the visible part of the communication operations, e.g., wait/test any non-blocking calls. As for buffered operations, the standard says (Section 8.7 Startup, pg. 359):

    Advice to implementors. Even though a process has executed all MPI calls needed to complete the communications it is involved with, such communication may not yet be completed from the viewpoint of the underlying MPI system. For example, a blocking send may have returned, even though the data is still buffered at the sender in an MPI buffer; an MPI process may receive a cancel request for a message it has completed receiving. **The MPI implementation must ensure that a process has completed any involvement in MPI communication before MPI_FINALIZE returns.** Thus, if a process exits after the call to MPI_FINALIZE, this will not cause an ongoing communication to fail. The MPI implementation should also complete freeing all objects marked for deletion by MPI calls that freed them. (End of advice to implementors.)

    (emphasis in bold is mine)
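    In other words, you must complete the visible part of each operation yourself; anything still buffered internally after a blocking send is the library's problem to flush. A short sketch of the division of responsibility (assumes two ranks):

    ```c
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, data = 42;
        MPI_Request req;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 1) {
            MPI_Isend(&data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
            /* Your responsibility: complete the visible part of the
             * non-blocking operation before finalizing. */
            MPI_Wait(&req, MPI_STATUS_IGNORE);
        } else if (rank == 0) {
            MPI_Recv(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }

        /* The library's responsibility: per the advice quoted above, any
         * data a blocking MPI_Send left in an internal buffer must be
         * delivered before MPI_FINALIZE returns, so finalizing right
         * after MPI_Send or MPI_Ssend is safe. */
        MPI_Finalize();
        return 0;
    }
    ```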