Tags: c, parallel-processing, mpi, hpc, mpi-io

MPI_File_read_at_all giving invalid count argument when trying to read big files


I wrote a simple C program to try out MPI-IO. The program reads a text file, and each process outputs the first character of the part it read. It works fine with several file sizes (100KB, 30MB, 500MB, 2.5GB), but when I tried a 7.5GB file, I got this error:

Invalid count, error stack:
MPI_FILE_IREAD_AT(104): Invalid count argument

I tried the collective method (MPI_File_read_at_all) and the independent method (MPI_File_read) and both of them fail to read the 7.5GB file. This is the code responsible for reading:

MPI_File fh;
MPI_Offset total_number_bytes, number_bytes;
long long nchars;
int errclass, resultlen;
char err_buffer[MPI_MAX_ERROR_STRING];    

MPI_File_open(MPI_COMM_WORLD, "bigfastq", MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
MPI_File_get_size(fh, &total_number_bytes);

number_bytes = total_number_bytes/size;
nchars = number_bytes/sizeof(char);
//char buf[nchars+1];
char *buf = (char*)malloc(sizeof(char)*nchars);
MPI_Offset offset = rank*number_bytes;

int err = MPI_File_read_at_all(fh, offset, buf, nchars, MPI_CHAR, &status);
if(err != MPI_SUCCESS){
    MPI_Error_class(err,&errclass);
    if (errclass== MPI_ERR_COUNT) {
        printf("Invalid count class!!\n");
    }
    MPI_Error_string(err,err_buffer,&resultlen);
    fprintf(stderr, "%s\n", err_buffer);  /* never pass the message itself as the format string */

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}

MPI_File_close(&fh);

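/* buf holds only this rank's chunk, so its first character is buf[0], not buf[offset] */
printf("rank: %d, offset: %lld, first char: %c, count: %lld\n", rank, (long long)offset, buf[0], count);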


MPI_Finalize();

Any ideas why it is giving this error when trying the 7.5GB file?

Thanks in advance!


Solution

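  • The fourth parameter of MPI_File_read_at_all() (count) is an int. Once each process's share exceeds INT_MAX (typically 2^31 - 1, just under 2 GiB for MPI_CHAR), your long long nchars is converted to int at the call and most likely ends up as a negative value, which MPI rejects as an invalid count.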

    You can create a large derived datatype (for example with MPI_Type_contiguous) so that count fits in a signed int, or issue several shorter reads.
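
    To illustrate the "several shorter reads" option, here is a minimal sketch of the chunking logic on its own. It is not taken from the original program: the names read_fn, next_chunk, read_in_chunks and memory_read are hypothetical, and the MPI call is replaced by a callback so the loop can be tried without an MPI installation.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical callback: read `count` bytes at `offset` into `dst`.
   Returns 0 on success. With MPI-IO it would wrap MPI_File_read_at. */
typedef int (*read_fn)(void *ctx, long long offset, char *dst, int count);

/* Size of the next piece: at most max_chunk, so it always fits in an int. */
int next_chunk(long long remaining, int max_chunk)
{
    assert(max_chunk > 0 && remaining >= 0);
    return remaining > max_chunk ? max_chunk : (int)remaining;
}

/* Read `total` bytes starting at `offset` in pieces of at most max_chunk.
   Returns the number of read calls issued, or -1 if one of them fails. */
long long read_in_chunks(read_fn rd, void *ctx, long long offset,
                         char *buf, long long total, int max_chunk)
{
    long long done = 0, calls = 0;
    while (done < total) {
        int n = next_chunk(total - done, max_chunk);
        if (rd(ctx, offset + done, buf + done, n) != 0)
            return -1;
        done += n;
        calls++;
    }
    return calls;
}

/* Stand-in reader for experiments: treats ctx as an in-memory "file". */
int memory_read(void *ctx, long long offset, char *dst, int count)
{
    memcpy(dst, (const char *)ctx + offset, (size_t)count);
    return 0;
}
```

    With MPI-IO, the callback would wrap a call such as MPI_File_read_at(fh, offset, dst, count, MPI_CHAR, &status), with INT_MAX or something smaller as max_chunk. If you stay with the collective MPI_File_read_at_all, every rank has to issue the same number of calls, so the number of rounds should be derived from the largest per-rank share, and ranks that finish early can pass a count of 0.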