In the specific problem I'm dealing with, the processes arranged in a 3D topology have to exchange portions of a 3D array A(:,:,:) with each other. In particular, each one has to send a given number of slices of A to the processes in the six oriented directions (e.g. A(nx-1:nx,:,:) to the process in the positive 1st dimension, A(1:3,:,:) in the negative one, A(:,ny-3:ny,:) in the positive y-dimension, and so on).
In order to do so, I'm going to define a set of subarray types (by means of MPI_TYPE_CREATE_SUBARRAY) to be used in the communications (maybe with MPI_NEIGHBOR_ALLTOALL, or its V or W extension). The question is which is the better choice in terms of performance or, to be more general, as in the title: is it better to define more "basic" MPI derived data types and use counts greater than 1 in the communications, or to define "bigger" types and use a count of 1?
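Here is a minimal sketch of what I have in mind (not my actual code; nx, ny, nz and the halo width nh are placeholders, and the matching receive types and the MPI_NEIGHBOR_ALLTOALLW call are only hinted at in the comments):

    program face_types_sketch
       use mpi
       implicit none
       integer, parameter :: nh = 2            ! number of slices per face (placeholder)
       integer, parameter :: nx = 64, ny = 64, nz = 64
       integer :: sizes(3), subsizes(3), starts(3)
       integer :: sendtypes(6)                 ! order -x, +x, -y, +y, -z, +z, i.e. the
                                               ! neighbor order of a 3D Cartesian communicator
       integer :: d, ierr

       call MPI_INIT(ierr)
       sizes = [nx, ny, nz]

       do d = 1, 3
          subsizes    = sizes
          subsizes(d) = nh                     ! an nh-slice-thick face

          starts = 0                           ! starts are always 0-based
          call MPI_TYPE_CREATE_SUBARRAY(3, sizes, subsizes, starts, &      ! e.g. A(1:nh,:,:) for d = 1
               MPI_ORDER_FORTRAN, MPI_DOUBLE_PRECISION, sendtypes(2*d-1), ierr)

          starts(d) = sizes(d) - nh
          call MPI_TYPE_CREATE_SUBARRAY(3, sizes, subsizes, starts, &      ! e.g. A(nx-nh+1:nx,:,:) for d = 1
               MPI_ORDER_FORTRAN, MPI_DOUBLE_PRECISION, sendtypes(2*d), ierr)

          call MPI_TYPE_COMMIT(sendtypes(2*d-1), ierr)
          call MPI_TYPE_COMMIT(sendtypes(2*d),   ierr)
       end do

       ! Matching receive types (into the ghost layers, whose layout I haven't
       ! fixed yet) would be built the same way; with subarray types the byte
       ! displacements passed to MPI_NEIGHBOR_ALLTOALLW can all be zero, since
       ! the offsets are already baked into the types.

       call MPI_FINALIZE(ierr)
    end program face_types_sketch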
MPI derived datatypes are defined to provide the library with a means of packing and unpacking the data you send. For basic types (MPI_INT, MPI_DOUBLE, etc.) there is no problem, since the data is already contiguous in memory: there are no holes. For more complex types, such as sections of multidimensional arrays or structures, sending the data as-is may be inefficient because you would probably also send useless data. For this reason, the data is packed into a contiguous array of bytes, sent to the destination, and then unpacked again to restore its original shape.
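As a small illustration (not taken from your code), a single row of a Fortran array A(n,m) is not contiguous: consecutive elements are n reals apart. Described with a strided derived type, the library does the packing and unpacking for you:

    ! run with at least two ranks
    program strided_row_sketch
       use mpi
       implicit none
       integer, parameter :: n = 4, m = 5
       double precision   :: A(n, m), row(m)
       integer            :: rowtype, rank, ierr, i
       integer            :: status(MPI_STATUS_SIZE)

       call MPI_INIT(ierr)
       call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

       ! one element every n reals, m times -> one row A(i,:)
       call MPI_TYPE_VECTOR(m, 1, n, MPI_DOUBLE_PRECISION, rowtype, ierr)
       call MPI_TYPE_COMMIT(rowtype, ierr)

       if (rank == 0) then
          A = reshape([(dble(i), i = 1, n*m)], [n, m])
          call MPI_SEND(A(2, 1), 1, rowtype, 1, 0, MPI_COMM_WORLD, ierr)   ! send row 2
       else if (rank == 1) then
          call MPI_RECV(row, m, MPI_DOUBLE_PRECISION, 0, 0, MPI_COMM_WORLD, status, ierr)
          print *, 'received row:', row
       end if

       call MPI_TYPE_FREE(rowtype, ierr)
       call MPI_FINALIZE(ierr)
    end program strided_row_sketch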
That being said, you need to create a derived datatype for each different shape in memory. For example, A(1:3,:,:) and A(nx-2:nx,:,:) represent the same datatype (same shape and strides, just a different starting address), but A(nx-2:nx,:,:) and A(:,ny-2:ny,:) do not. If you specify the stride correctly (the gap between consecutive datatypes), you can even define a 2D derived datatype and then vary the count argument to make your program more flexible.
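For instance, assuming an nh-slice face of A(nx,ny,nz) (placeholder names), the sketch below builds both variants: a 2D subarray of a single plane used with count = nz, and a full 3D subarray used with count = 1. Since the extent of a subarray type equals the extent of the whole array it is cut from, the count argument steps plane by plane and both calls transfer exactly the same data:

    program count_vs_bigger_type
       use mpi
       implicit none
       integer, parameter :: nx = 4, ny = 3, nz = 5, nh = 2
       double precision   :: A(nx, ny, nz), B2(nh, ny, nz), B3(nh, ny, nz)
       integer            :: face2d, face3d, ierr, i
       integer            :: status(MPI_STATUS_SIZE)

       call MPI_INIT(ierr)
       A = reshape([(dble(i), i = 1, nx*ny*nz)], [nx, ny, nz])

       ! (a) "basic" 2D type: the first nh rows of one nx-by-ny plane.
       !     Its extent is that of the whole plane, so count = nz selects A(1:nh,:,:).
       call MPI_TYPE_CREATE_SUBARRAY(2, [nx, ny], [nh, ny], [0, 0], &
            MPI_ORDER_FORTRAN, MPI_DOUBLE_PRECISION, face2d, ierr)
       call MPI_TYPE_COMMIT(face2d, ierr)

       ! (b) "bigger" 3D type: A(1:nh,:,:) in one shot, used with count = 1.
       call MPI_TYPE_CREATE_SUBARRAY(3, [nx, ny, nz], [nh, ny, nz], [0, 0, 0], &
            MPI_ORDER_FORTRAN, MPI_DOUBLE_PRECISION, face3d, ierr)
       call MPI_TYPE_COMMIT(face3d, ierr)

       ! Send each version to ourselves and unpack into a contiguous buffer:
       ! both transfers carry exactly the same nh*ny*nz values.
       call MPI_SENDRECV(A, nz, face2d, 0, 0, B2, nh*ny*nz, MPI_DOUBLE_PRECISION, &
            0, 0, MPI_COMM_SELF, status, ierr)
       call MPI_SENDRECV(A, 1, face3d, 0, 1, B3, nh*ny*nz, MPI_DOUBLE_PRECISION, &
            0, 1, MPI_COMM_SELF, status, ierr)
       print *, 'identical: ', all(B2 == B3)

       call MPI_TYPE_FREE(face2d, ierr)
       call MPI_TYPE_FREE(face3d, ierr)
       call MPI_FINALIZE(ierr)
    end program count_vs_bigger_type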
Finally, to answer your last question, this is probably worth benchmarking, although I think the difference will not be very noticeable, since both variants result in a single MPI message.