Consider the following simple program, which writes the rank of every process whose rank is greater than zero into a file:
#include <mpi.h>

int main() {
    MPI_Init(NULL, NULL);

    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    MPI_Offset offset;
    MPI_Status status;
    MPI_File fh;
    // Open the file collectively on all ranks (created if it does not exist)
    MPI_File_open(MPI_COMM_WORLD, "myfile", MPI_MODE_CREATE | MPI_MODE_WRONLY,
                  MPI_INFO_NULL, &fh);

    // Each rank owns one int-sized slot at byte offset rank * sizeof(int)
    offset = world_rank * sizeof(int);
    // Rank 0 deliberately writes nothing, so the first slot is never written
    if (world_rank > 0) {
        MPI_File_write_at(fh, offset, &world_rank, 1, MPI_INT, &status);
    }

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
We compiled it and ran it on 4 processes:
mpic++ main.cpp
mpirun --oversubscribe -n 4 a.out
Inspecting the written file with hexdump -C myfile gives:
00000000 00 00 00 00 01 00 00 00 02 00 00 00 03 00 00 00 |................|
00000010
Note that I never issued a write for the first integer (the first 4 bytes of the file), yet those bytes are zero.
Can I be sure that those are always zero?
MPI_File_write_at will write the data into the file as binary data. When you then run hexdump -C myfile, that command displays the data with the leftmost column (8 hex digits) being the offset of that row within the file. That column is not part of the binary data per se; hexdump -C adds it for readability.
The hexadecimal 00000010 represents 10000 in binary and 16 in decimal. If you look at your first line, setting the offset column aside:
offset   | 4 bytes     4 bytes     4 bytes     4 bytes
00000000 | 00 00 00 00 01 00 00 00 02 00 00 00 03 00 00 00
00000010 |
You have 16 (4x4) bytes of data, which is why the next row starts at offset 00000010.
Can I be sure that those are always zero?
As far as the standard is concerned, I have not found any explicit statement that, if one skips the beginning of the file (offset > 0), the MPI implementation will fill that gap with zeros. For instance, with the MPI version that I am using (Open MPI 1.8.8), if I modify your code to:
if (world_rank == 3) {
MPI_File_write_at(fh, offset, &world_rank, 1, MPI_INT, &status);
}
I get the following output from hexdump -C myfile:
00000000 00 00 00 00 00 00 00 00 00 00 00 00 03 00 00 00 |................|
00000010
So the MPI version that I am using, and apparently yours as well, leaves zeros in the part of the file that was never written.
Nonetheless, unless a reliable source can be found (I did not manage to find one) that explicitly states that in your case the first 4 bytes will always be zero, I would advise against making any assumptions in that regard. In any case, one should not rely on the content of file regions that no process wrote.
EDIT: A clarification from "the Open MPI mailing list":
In general, the contents of a file written by the MPI IO interface are going to be implementation-specific.