c++ windows mmap streambuf memory-mapping

Reading integers from a memory mapped formatted file


I have memory mapped a large formatted (text) file containing one integer per line like so:

123
345
34324
3232
...

So, I have a pointer to the memory at the first byte and also a pointer to the memory at the last byte. I am trying to read all those integers into an array as fast as possible. Initially I created a specialized std::streambuf class to work with std::istream to read from that memory, but it seems to be relatively slow.

Do you have any suggestions on how to efficiently parse a string like "1231232\r\n123123\r\n123\r\n1231\r\n2387897..." into an array {1231232,123123,123,1231,2387897,...} ?

The number of integers in the file is not known beforehand.


Solution

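  • std::vector<int> array;
    char * p = ...;   // start of memory mapped block
    char * end = ...; // one past the last byte of the memory mapped block
    while (p < end)
    {
        // strtol parses one integer and advances p past its last digit
        array.push_back(static_cast<int>(strtol(p, &p, 10)));
        // skip the line ending ("\r\n") up to the next digit
        while (p < end && !isdigit(static_cast<unsigned char>(*p)))
            ++p;
    }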
    

    This code is a little unsafe: strtol expects a NUL-terminated string, so if the file ends in a digit there's no guarantee that it will stop at the end of the memory mapped block. It's a start, though, and it should still run very fast even with additional bounds checking added.
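
One possible way to add that checking, as a minimal sketch rather than a tuned implementation, is to swap strtol for std::from_chars (C++17), which takes an explicit end pointer and never reads past it. The function name parse_ints and its parameters are made up for illustration. The sketch assumes last points one past the final byte of the mapping and that the values fit in an int; numbers that overflow are skipped.

```cpp
#include <charconv>
#include <system_error>
#include <vector>

// Hypothetical helper: collect every decimal integer found in [first, last).
// Unlike strtol, std::from_chars never reads at or beyond `last`.
std::vector<int> parse_ints(const char* first, const char* last)
{
    std::vector<int> out;
    const char* p = first;
    while (p < last)
    {
        int value = 0;
        auto [next, ec] = std::from_chars(p, last, value);
        if (ec == std::errc())
        {
            out.push_back(value);   // parsed one integer
            p = next;               // continue right after its last digit
        }
        else if (ec == std::errc::result_out_of_range)
        {
            p = next;               // number does not fit in int: skip it
        }
        else
        {
            ++p;                    // no number here ('\r', '\n', ...): skip one byte
        }
    }
    return out;
}
```

With the two pointers from the question this would be called as parse_ints(begin, last_byte + 1). Since the number of integers is not known beforehand, the vector simply grows as needed.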