Tags: c++, parallel-processing, mpi, openmpi

How can I use Remote Memory Access (RMA) functions of MPI to parallelize data aggregation?


I am working on a Monte Carlo code that generates data. More specifically, each generated data point falls into one of num_data_domains separate data domains, and in the end each domain should contain at least min_sample_size data points. Here is what the non-parallel code looks like:

int num_data_domains = 10;
std::vector<unsigned long int> counters(num_data_domains, 0);
std::vector<std::vector<double>> data_sets(num_data_domains);

unsigned int min_sample_size = 100;
unsigned int smallest_sample_size = 0;
while(smallest_sample_size < min_sample_size)
{
    double data_point = Generate_Data_Point();
    int data_domain = Identify_Data_Domain(data_point); // returns a number between 0 and num_data_domains-1
    data_sets[data_domain].push_back(data_point);
    
    counters[data_domain]++;
    
    smallest_sample_size = *std::min_element(std::begin(counters), std::end(counters));

}

Based on the answers to my previous question, I would like to use RMA functions to parallelize this process with MPI. But I cannot get it to work.

Here is my parallelized version of the above code.

int num_data_domains = 10;
std::vector<unsigned long int> counters(num_data_domains, 0);
std::vector<std::vector<double>> data_sets(num_data_domains);

MPI_Win mpi_window;
MPI_Win_create(&counters, num_data_domains * sizeof(unsigned long int), sizeof(unsigned long int), MPI_INFO_NULL, MPI_COMM_WORLD, &mpi_window);
int mpi_target_rank         = 0;
unsigned long int increment = 1;

unsigned int min_sample_size = 100;
unsigned int smallest_sample_size = 0;
while(smallest_sample_size < min_sample_size)
{
    double data_point = Generate_Data_Point();
    int data_domain = Identify_Data_Domain(data_point); // returns a number between 0 and num_data_domains-1
    data_sets[data_domain].push_back(data_point);
    
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 0, 0, mpi_window);
    MPI_Accumulate(&increment, 1, MPI_UNSIGNED_LONG, mpi_target_rank, data_domain, 1, MPI_UNSIGNED_LONG, MPI_SUM, mpi_window);
    MPI_Win_unlock(0, mpi_window);

    
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 0, 0, mpi_window);
    MPI_Get( &counters , num_data_domains , MPI_UNSIGNED_LONG , mpi_target_rank , 0 , num_data_domains , MPI_UNSIGNED_LONG , mpi_window);
    MPI_Win_unlock(0, mpi_window);

    smallest_sample_size = *std::min_element(std::begin(counters), std::end(counters));

}
MPI_Win_free(&mpi_window);

Here, the counters of MPI process 0 (the "master") are supposed to be updated via MPI_Accumulate(). The fifth argument, data_domain, is meant to be the displacement into the target buffer, i.e. it should ensure that the correct domain counter is incremented. Afterwards, each worker copies the remote counters from rank 0 back into its own local counters via MPI_Get().
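
To check my understanding of the displacement: as far as I know, MPI computes the target location as window_base + target_disp * disp_unit, so with disp_unit = sizeof(unsigned long int) and target_disp = data_domain the accumulate should land exactly on counters[data_domain] of rank 0. Here is a small non-MPI snippet (with made-up values) that only illustrates this address arithmetic:

#include <cstdio>
#include <vector>

int main()
{
    int num_data_domains = 10;
    std::vector<unsigned long int> counters(num_data_domains, 0);
    int data_domain = 3; // hypothetical value, just for illustration

    // MPI addresses the target of the accumulate as
    //     window_base + target_disp * disp_unit   (in bytes)
    unsigned char* window_base = reinterpret_cast<unsigned char*>(counters.data());
    void* target = window_base + data_domain * sizeof(unsigned long int);

    std::printf("computed target address: %p\n", target);
    std::printf("&counters[data_domain]:  %p\n", static_cast<void*>(&counters[data_domain]));
    return 0;
}

Both lines print the same address, so I believe the displacement itself is correct.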

However, if I set up the code like this, I obtain a segmentation fault:

[MacBook-Pro:84733] *** Process received signal ***
[MacBook-Pro:84733] Signal: Segmentation fault: 11 (11)
[MacBook-Pro:84733] Signal code: Address not mapped (1)
[MacBook-Pro:84733] Failing at address: 0x9
[MacBook-Pro:84733] [ 0] 0   libsystem_platform.dylib            0x00007fff6fde65fd _sigtramp + 29
[MacBook-Pro:84733] [ 1] 0   ???                                 0x0000000000000000 0x0 + 0
[MacBook-Pro:84733] [ 2] 0   executable                          0x000000010e53c14b main + 1083
[MacBook-Pro:84733] [ 3] 0   libdyld.dylib                       0x00007fff6fbedcc9 start + 1
[MacBook-Pro:84733] [ 4] 0   ???                                 0x0000000000000002 0x0 + 2
[MacBook-Pro:84733] *** End of error message ***
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node MacBook-Pro exited on signal 11 (Segmentation fault: 11).
--------------------------------------------------------------------------

I'm fairly certain that the MPI_Accumulate() call causes this error. What am I doing wrong?


Solution

  • The problem is that the address of an std::vector object is not the same as the address of the data you wish to modify. You can see this with the program below.

    When you create a window using counters, you are exposing the memory starting from the address of counters when what you want to expose is the memory starting from the address of counters[0], which is equivalent to counters.data().

    #include <vector>
    #include <iostream>
    
    int main(void)
    {
        int num_data_domains = 10;
        std::vector<unsigned long int> counters(num_data_domains, 0);
    
        std::cout << "address of counters: " << &counters << std::endl;
        std::cout << "address of counters[0]: " << &(counters[0]) << std::endl;
        std::cout << "counters.data(): " << counters.data() << std::endl;
    
        return 0;
    }
    
    $ clang++ stl.cc
    $ ./a.out 
    address of counters: 0x7ffee8604e70
    address of counters[0]: 0x7ffcd1c02b30
    counters.data(): 0x7ffcd1c02b30
    

    If you change the first argument of MPI_Win_create() in your MPI program to one of the following, it should work as intended.

    MPI_Win_create(&(counters[0]), num_data_domains * sizeof(unsigned long int),
                   sizeof(unsigned long int), MPI_INFO_NULL, MPI_COMM_WORLD, &mpi_window);
    
    MPI_Win_create(counters.data(), num_data_domains * sizeof(unsigned long int),
                   sizeof(unsigned long int), MPI_INFO_NULL, MPI_COMM_WORLD, &mpi_window);
    

    The .data() member function was only introduced in C++11, so if you are compiling against an older standard you might prefer the former.

    The reason your program works when you use a plain C array instead is that, for an array, counters, &counters, and &(counters[0]) all refer to the same address, so the window is created over the actual data.
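
    Putting it together, here is a minimal sketch of how the corrected parallel version could look. The stubs for Generate_Data_Point() and Identify_Data_Domain() are placeholders just so the sketch compiles on its own, and note that the same vector-object-versus-data distinction applies to the origin buffer passed to MPI_Get, so counters.data() is used there as well:

    #include <mpi.h>

    #include <algorithm>
    #include <cstdlib>
    #include <vector>

    // Placeholder stand-ins for your functions, only so this sketch compiles.
    double Generate_Data_Point() { return std::rand() / static_cast<double>(RAND_MAX); }
    int Identify_Data_Domain(double x) { return static_cast<int>(x * 10) % 10; }

    int main(int argc, char** argv)
    {
        MPI_Init(&argc, &argv);

        int num_data_domains = 10;
        std::vector<unsigned long int> counters(num_data_domains, 0);
        std::vector<std::vector<double>> data_sets(num_data_domains);

        // Expose the vector's element storage, not the vector object itself.
        MPI_Win mpi_window;
        MPI_Win_create(counters.data(), num_data_domains * sizeof(unsigned long int),
                       sizeof(unsigned long int), MPI_INFO_NULL, MPI_COMM_WORLD, &mpi_window);

        int mpi_target_rank         = 0;
        unsigned long int increment = 1;

        unsigned int min_sample_size      = 100;
        unsigned int smallest_sample_size = 0;
        while(smallest_sample_size < min_sample_size)
        {
            double data_point = Generate_Data_Point();
            int data_domain   = Identify_Data_Domain(data_point);
            data_sets[data_domain].push_back(data_point);

            // Atomically add 1 to counters[data_domain] on rank 0.
            MPI_Win_lock(MPI_LOCK_EXCLUSIVE, mpi_target_rank, 0, mpi_window);
            MPI_Accumulate(&increment, 1, MPI_UNSIGNED_LONG, mpi_target_rank,
                           data_domain, 1, MPI_UNSIGNED_LONG, MPI_SUM, mpi_window);
            MPI_Win_unlock(mpi_target_rank, mpi_window);

            // Copy the global counters from rank 0 into the local vector;
            // the origin buffer must also be the element storage.
            MPI_Win_lock(MPI_LOCK_EXCLUSIVE, mpi_target_rank, 0, mpi_window);
            MPI_Get(counters.data(), num_data_domains, MPI_UNSIGNED_LONG, mpi_target_rank,
                    0, num_data_domains, MPI_UNSIGNED_LONG, mpi_window);
            MPI_Win_unlock(mpi_target_rank, mpi_window);

            smallest_sample_size = *std::min_element(std::begin(counters), std::end(counters));
        }

        MPI_Win_free(&mpi_window);
        MPI_Finalize();
        return 0;
    }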