I don't understand well how MPI_Reduce works with arrays. I need to do an element-wise sum. To test the MPI_Reduce function I wrote this simple code, and it works:
double a[4] = {0, 1, 2, (double)process_id};
double b[4];
MPI_Reduce(&a, &b, 4, MPI_DOUBLE, MPI_SUM, p-1, MPI_COMM_WORLD);
if(id == p-1) {
    for(int i = 0; i < 4; i++){
        printf("%f, ", b[i]);
    }
}
When I run this code with 4 processes it prints:
0.000000, 4.000000, 8.000000, 6.000000
It works!
Now I implement my problem. Assuming I use p processes, I need to reduce p matrices of dimension m * n, so I rewrite each matrix in the form of an array:
double *a;
double **A;
A = new double*[m];
// code that computes matrix A
a = (double *) malloc(m * n * sizeof(double));
int k = 0;
for(int i = 0; i < m; i++) {
    for(int j = 0; j < n; j++){
        a[k] = A[i][j];
        k++;
    }
}
In this way I have the matrix that I need to reduce in the form of an array. Now I execute this reduction:
if(id == p-1){
    reduce_storage = (double *) malloc(m * n * sizeof(double));
}
MPI_Reduce(&a, &reduce_storage, m * n, MPI_DOUBLE, MPI_SUM, p-1, MPI_COMM_WORLD);
Array a and reduce_storage are allocated in the same way, so they have the same dimension m * n, which is the value of the count argument of MPI_Reduce. I don't understand why, when I try to run it, I get this error:
*** stack smashing detected ***: <unknown> terminated
[EdoardoPC:01104] *** Process received signal ***
[EdoardoPC:01104] Signal: Aborted (6)
[EdoardoPC:01104] Signal code: (-6)
[EdoardoPC:01104] *** Process received signal ***
[EdoardoPC:01104] Signal: Segmentation fault (11)
[EdoardoPC:01104] Signal code: (128)
[EdoardoPC:01104] Failing at address: (nil)
I don't understand well how MPI_Reduce works with arrays. I need to do an element-wise sum.
From the MPI_Reduce documentation one can read:
Reduces values on all processes to a single value
int MPI_Reduce(const void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)
In your case MPI_Reduce performs the element-wise sum across the ranks, as illustrated by the reduction figure at https://mpitutorial.com/tutorials/mpi-reduce-and-allreduce/.
From the same source one can read:
MPI_Reduce takes an array of input elements on each process and returns an array of output elements to the root process. The output elements contain the reduced result.
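To make the quoted snippets fully runnable, here is a minimal self-contained sketch of such an element-wise reduction; the names id and p mirror your snippet and are assumptions on my part:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int id, p;
    MPI_Comm_rank(MPI_COMM_WORLD, &id); // this process' rank
    MPI_Comm_size(MPI_COMM_WORLD, &p);  // total number of processes

    // Each rank contributes one array; MPI_Reduce sums them element-wise
    // into b on the root (here the last rank, p-1).
    double a[4] = {0, 1, 2, (double)id};
    double b[4];
    MPI_Reduce(a, b, 4, MPI_DOUBLE, MPI_SUM, p - 1, MPI_COMM_WORLD);

    if (id == p - 1) {
        for (int i = 0; i < 4; i++)
            printf("%f, ", b[i]);
        printf("\n");
    }
    MPI_Finalize();
    return 0;
}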
Now let us look at your problem.
To test the MPI_Reduce function I wrote this simple code and it works:
double a[4] = {0,1,2,(double)process_id};
double b[4];
MPI_Reduce(&a, &b, 4, MPI_DOUBLE, MPI_SUM, p-1, MPI_COMM_WORLD);
All the parameters are correct; &a and &b match const void *sendbuf and void *recvbuf, respectively. The same applies to the remaining parameters, namely the int count, the MPI_Datatype, the MPI_Op, the int root, and the MPI_Comm.
In this context, having a and b or &a and &b, respectively, is the "same", in the sense that a and &a yield the same memory address. Notwithstanding, there are important differences between using a and &a; for an in-depth explanation read the following: difference between “array” and “&array”.
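A quick sketch illustrating that distinction (plain C, nothing MPI-specific here):

#include <stdio.h>

int main(void) {
    double a[4] = {0, 1, 2, 3};
    // a decays to a pointer to its first element (type double*),
    // while &a is a pointer to the whole array (type double(*)[4]).
    // They print the same address, but pointer arithmetic differs:
    printf("a    = %p\n", (void*)a);
    printf("&a   = %p\n", (void*)&a);
    printf("a+1  advances %zu bytes\n", (size_t)((char*)(a + 1) - (char*)a));   // sizeof(double)
    printf("&a+1 advances %zu bytes\n", (size_t)((char*)(&a + 1) - (char*)&a)); // sizeof a
    return 0;
}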
Array a and reduce_storage are allocated in the same way, so they have the same dimension m * n, which is the value of the count argument of MPI_Reduce. I don't understand why, when I try to run it, I get this error:
In the second call

MPI_Reduce(&a, &reduce_storage, m * n, MPI_DOUBLE, MPI_SUM, p-1, MPI_COMM_WORLD);

the arguments a and reduce_storage are now both of type double*, and you are passing &a and &reduce_storage to MPI_Reduce. This is wrong, because &a and &reduce_storage return the addresses of the variables a and reduce_storage themselves, which are pointers to a pointer-to-double rather than pointers to the data.
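To fix it, pass the pointers directly; they already hold the addresses of the buffers. (Note that recvbuf is significant only at the root, so on the other ranks reduce_storage can simply be initialized to NULL.)

// a and reduce_storage are double*, i.e. already the addresses of the data
MPI_Reduce(a, reduce_storage, m * n, MPI_DOUBLE, MPI_SUM, p-1, MPI_COMM_WORLD);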
Assuming I use p processes, I need to reduce p matrices of dimensions m * n
Side note: using p as the total number of processes is a little confusing; a better name IMO would be total_processes, number_of_processes, or something along those lines.
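For instance, a small sketch of that naming (MPI_Comm_size is the standard call that fills it in):

int number_of_processes;
MPI_Comm_size(MPI_COMM_WORLD, &number_of_processes);
int root = number_of_processes - 1; // the last rank acts as the reduction root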