Search code examples
pythonmpiipythonmpich

How does MPI allreduce work


I don't have much experience with MPI and I try to understand how allreduce work. Below is a simple example inspired by this IPython tutorial. 2 MPI engines are launched on a local computer from the IPython notebook dasboard here:

In [1]: import numpy as np
        from IPython.parallel import Client

In [2]: c = Client(profile='mpi')

In [3]: view = c[:]

In [4]: view.scatter('a', np.arange(4.))
Out[4]: <AsyncResult: scatter>

In [5]: %%px
        from mpi4py import MPI
        import numpy as np

        print MPI.COMM_WORLD.allreduce(np.sum(a), op=MPI.SUM)
[stdout:0] 1.0
[stdout:1] 5.0

I would have expected each engine to print "6.0", like in the IPython tutorial. Here, it is as if the reduction operation was not performed. It's probably very simple, but I don't quite see what I am doing wrong?

I use:

  • Ubuntu 12.04
  • Python 2.7.3 32-bit
  • IPython 1.1.0
  • mpi4py 1.2.2
  • mpich2

Solution

  • That is the behavior you would see if your engines were not actually started with MPI. Since your engines have no MPI peers, allreduce doesn't do anything - it just returns the value of np.sum(a) on each engine, which is what you are seeing.

    It's a good idea to check that MPI is set up properly:

    %px print MPI.COMM_WORLD.Get_rank(), MPI.COMM_WORLD.Get_size()
    

    If your engines are not in the same MPI world, your output will look like this:

    [stdout:0] 0 1
    [stdout:1] 0 1
    

    And if they are:

    [stdout:0] 0 2
    [stdout:1] 1 2
    

    Make sure you start your engines with MPI. For example:

    ipcluster start --engines MPI
    

    Or add to ipcluster_config.py:

    c.IPClusterEngines.engine_launcher_class = 'MPI'
    

    Or just do it manually without any configuration (this is all the above configuration does anyway):

    mpiexec -n 4 ipengine