I have some MPI-enabled python MCMC sampling code that fires off parallel likelihood calls to separate cores. Because it's (necessarily - don't ask) rejection sampling, I only need one of the np samples to be successful to begin the next iteration, and have quite happily achieved a ~ np x speed-up by this method in the past.
I have applied this to a new problem where the likelihood calls an f2py-wrapped Fortran subroutine. In this case, on each iteration the other np-1 processes wait for the slowest (sometimes very slow) result to come in, even if one of those np-1 has already returned an acceptable sample.
So I suspect I need to pass a message to all non-winning (in speed terms) processes to terminate so that the next iteration can begin, and I need to get clear on some details of the best way to do this, as below.
The python code goes something like this. The sampler is PyMultiNEST.
from mpi4py import MPI

world = MPI.COMM_WORLD

def myloglike(parameters, data, noise):
    modelDataRealisation, status = call_fortran_sub(parameters)
    if status == 0:  # Model generated OK
        winner = world.rank  # This is the rank of the current winner
        # I want to pass a message to the other still-running processes
        # identifying that a successful sample has come back
        won = world.bcast(winner, root=winner)
    # I tried receiving the message here but the fortran_sub doesn't know
    # anything about this - need to go deeper - see below
    # Calculate chisq value etc.
    loglike = f(data, modelDataRealisation, noise)
    return loglike
Should the broadcast go via the master process?
Now, the tricky part is how to receive the kill signal in the F90 code. Presumably if the code is constantly listening (in a while loop?) it will slow down a lot - but should I anyway be using something like:
integer :: winner, ierr, status(MPI_STATUS_SIZE)
call MPI_RECV(winner,1,MPI_INTEGER,MPI_ANY_SOURCE,MPI_ANY_TAG&
     &,MPI_COMM_WORLD,status,ierr)
And then how to best to kill that process once the message has been received?
Finally, do I need to do anything in the F code to make the next iteration restart OK/spawn new processes?
Thanks!
What you are trying to do is not exactly textbook MPI, so I don't have a textbook answer for you. It sounds like you do not know how long a "bad" result will take.
You ask "Presumably if the code is always listening out (while loop?) it will slow down a lot" -- but if you are using non-blocking sends and receives, you can do work for, say, 100 iterations and then test for a "stop work" message.
I would avoid MPI_Bcast here, as that's not exactly what you want: MPI_Bcast is a collective, so every rank must call it with the same root, and you don't know the root (the winner) in advance. One process wins; that process should then send an "I won!" message to everyone else. Yes, that is n-1 point-to-point operations, which would only become a headache at very large process counts.
On the worker side, MPI_Irecv with MPI_ANY_SOURCE will match any process's "I won!" message. Periodically test the request for completion.