I am just trying to get my head around MPI and can't seem to understand why the following program's output is different from what I expect.
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
int *sendbuf, *recvbuf;
sendbuf = (int *) malloc(sizeof(int) * size);
recvbuf = (int *) malloc(sizeof(int) * size);
for(int i = 0; i < size; i++) {
    sendbuf[i] = rank;
}
for(int i = 0; i < size; i++) {
    printf("sendbuf[%d] = %d, rank: %d\n", i, sendbuf[i], rank);
}
MPI_Scatter(sendbuf, 1, MPI_INT,
            recvbuf, 1, MPI_INT, rank, MPI_COMM_WORLD);
for(int i = 0; i < size; i++) {
    printf("recvbuf[%d] = %d, rank: %d\n", i, recvbuf[i], rank);
}
As far as I understand, MPI_Scatter sends sendcount values from an array to all processes.
In my example I gave each process an array filled with its own rank number.
Then each process sends one of the indexes in its array to all other processes. With two processes, the first process has a sendbuf array of:
sendbuf[0] = 0
sendbuf[1] = 0
And the second process (rank 1) has an array of the same length (the communicator size returned by MPI_Comm_size), filled with 1.
The expected output should be:
recvbuf[0] = 0, rank: 0
recvbuf[1] = 1, rank: 0
recvbuf[0] = 0, rank: 1
recvbuf[1] = 1, rank: 1
But instead I get the following output (for two processes):
sendbuf[0] = 0, rank: 0
sendbuf[1] = 0, rank: 0
sendbuf[0] = 1, rank: 1
sendbuf[1] = 1, rank: 1
recvbuf[0] = 0, rank: 0
recvbuf[1] = 32690, rank: 0
recvbuf[0] = 1, rank: 1
recvbuf[1] = 32530, rank: 1
Any help pointing out my mistake is much appreciated.
The problem lies in the use of MPI_Scatter to accomplish your goal:
Sends data from one process to all other processes in a communicator
Synopsis
int MPI_Scatter(const void *sendbuf, int sendcount, MPI_Datatype sendtype,
                void *recvbuf, int recvcount, MPI_Datatype recvtype,
                int root, MPI_Comm comm)
Input Parameters
sendbuf    address of send buffer (choice, significant only at root)
sendcount  number of elements sent to each process (integer, significant only at root)
sendtype   data type of send buffer elements (significant only at root) (handle)
recvcount  number of elements in receive buffer (integer)
recvtype   data type of receive buffer elements (handle)
root       rank of sending process (integer)
comm       communicator (handle)
Every process must call MPI_Scatter with the same root, rather than with a different root per process (i.e., each process's own rank) as you have done:
MPI_Scatter(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, rank, MPI_COMM_WORLD);
                                                      ^^^^
Therefore, you are misusing MPI_Scatter: the documented purpose of that routine is "Sends data from one process to all other processes in a communicator". The following image (taken from source) illustrates it best:
There is only one root process, which scatters its data across the processes. This routine is used, for instance, when one process holds a chunk of data (e.g., an array) and the code performs some operation over that data. You can parallelize the code by splitting the data among the processes, so that each process performs that operation in parallel on its assigned chunk. Afterward, you might call MPI_Gather to collect the data from all the processes back on the process it originally came from.
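To make the data movement concrete, here is a minimal sketch in plain C that models what MPI_Scatter does, with no MPI calls. The helper name model_scatter and the recv_cap parameter are hypothetical, introduced only for illustration; row r of recv stands in for rank r's receive buffer. It also suggests why recvbuf[1] printed garbage in your run: with recvcount = 1 only the first element of each receive buffer is written, and the rest of the malloc'd memory stays uninitialized.

```c
#include <assert.h>

/* Plain C model of MPI_Scatter's data movement (not an MPI call).
 * model_scatter and recv_cap are hypothetical names for this sketch.
 *   root_send: the root's send buffer, nprocs * count ints
 *   recv:      nprocs rows of recv_cap ints; row r models rank r's recvbuf */
void model_scatter(const int *root_send, int count, int nprocs,
                   int *recv, int recv_cap)
{
    assert(count > 0 && nprocs > 0 && recv_cap >= count);
    for (int r = 0; r < nprocs; r++) {        /* destination rank */
        for (int j = 0; j < count; j++) {     /* element inside chunk r */
            recv[r * recv_cap + j] = root_send[r * count + j];
        }
        /* elements count..recv_cap-1 of row r are left untouched */
    }
}
```

For example, with two ranks, count = 1, a root buffer {10, 20} and receive rows pre-filled with -1, rank 0's row becomes {10, -1} and rank 1's row becomes {20, -1}: each rank gets exactly one chunk, and only from the single root.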
You wrote: "Then each process sends one of the indexes in its array to all other processes."
For that you can use MPI_Allgather instead, which "Gathers data from all tasks and distribute the combined data to all tasks". The following image (taken from source) illustrates it best:
As you can see, each process will gather the data sent by all processes (including itself).
A running example:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
int main(int argc, char **argv){
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    int *sendbuf = malloc(sizeof(int) * size);
    int *recvbuf = malloc(sizeof(int) * size);
    // Fill the send buffer with this process' rank (only sendbuf[0] is sent below)
    for(int i = 0; i < size; i++)
        sendbuf[i] = rank;
    // Every process contributes 1 int and receives 1 int from each rank,
    // stored in rank order, so recvbuf[i] holds the value sent by rank i
    MPI_Allgather(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);
    for(int i = 0; i < size; i++)
        printf("recvbuf[%d] = %d, rank: %d\n", i, recvbuf[i], rank);
    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}
OUTPUT for two processes:
recvbuf[0] = 0, rank: 0
recvbuf[1] = 1, rank: 0
recvbuf[0] = 0, rank: 1
recvbuf[1] = 1, rank: 1
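The index mapping behind that output can be sketched in plain C, again without any MPI calls. This is only a model of the data movement: model_allgather and send_cap are hypothetical names, and each row of send and recv stands in for one rank's buffer.

```c
#include <assert.h>

/* Plain C model of MPI_Allgather's data movement (not an MPI call).
 * model_allgather and send_cap are hypothetical names for this sketch.
 *   send: nprocs rows of send_cap ints; row s models rank s's sendbuf,
 *         of which only the first `count` ints are sent
 *   recv: nprocs rows of nprocs * count ints; row r models rank r's recvbuf */
void model_allgather(const int *send, int send_cap, int count, int nprocs,
                     int *recv)
{
    assert(count > 0 && nprocs > 0 && send_cap >= count);
    int row = nprocs * count;                  /* ints per receive buffer */
    for (int r = 0; r < nprocs; r++)           /* receiving rank */
        for (int s = 0; s < nprocs; s++)       /* sending rank */
            for (int j = 0; j < count; j++)
                /* block s of every receiver is a copy of sender s's data */
                recv[r * row + s * count + j] = send[s * send_cap + j];
}
```

With two ranks, count = 1 and send rows {0, 0} and {1, 1}, both receive rows end up as {0, 1}, which matches the output above.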
For your particular case (with the same input size), MPI_Alltoall would also work. To understand the differences between MPI_Allgather and MPI_Alltoall, I recommend checking this SO thread.
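As a rough illustration of how MPI_Alltoall differs, here is a plain C model of its data movement (no MPI calls; model_alltoall is a hypothetical helper name). Each sender splits its buffer into one block per rank, and block r goes to rank r, so the result is effectively a transpose of the blocks.

```c
#include <assert.h>

/* Plain C model of MPI_Alltoall's data movement (not an MPI call).
 * model_alltoall is a hypothetical name for this sketch.
 *   send: nprocs rows of nprocs * count ints; row s models rank s's sendbuf
 *   recv: same shape; row r models rank r's recvbuf */
void model_alltoall(const int *send, int count, int nprocs, int *recv)
{
    assert(count > 0 && nprocs > 0);
    int row = nprocs * count;                  /* ints per buffer */
    for (int r = 0; r < nprocs; r++)           /* receiving rank */
        for (int s = 0; s < nprocs; s++)       /* sending rank */
            for (int j = 0; j < count; j++)
                /* block r of sender s lands in block s of receiver r */
                recv[r * row + s * count + j] = send[s * row + r * count + j];
}
```

With send rows {0, 0} and {1, 1}, as in your program, both receive rows become {0, 1}, the same as with MPI_Allgather. With send rows {1, 2} and {3, 4}, rank 0 receives {1, 3} and rank 1 receives {2, 4}, whereas an allgather with a count of 1 would deliver {1, 3} to both ranks.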