c++ endianness memory-mapped-files

Efficient way of swapping bytes in a memory-mapped file


I managed to parse a large binary file (~8 GB) by reading blocks of data into memory and swapping the big-endian integers with the functions shown below. I am now trying to improve performance by using Boost memory-mapped files, but I cannot use the endian_swap functions because the file is opened in read-only mode. Is there an efficient way to swap the bytes without writing to the original file? If not, would the I/O overhead hurt performance?

inline void endian_swap(unsigned short int& x)
{
  x = (x>>8) |
    (x<<8);
}
inline void endian_swap(unsigned int& x)
{
  x = (x>>24) |
    ((x<<8) & 0x00FF0000) |
    ((x>>8) & 0x0000FF00) |
    (x<<24);
}
inline void endian_swap(unsigned long long int& x)
{
  x = (x << 56) |
      ((x << 40) & 0x00FF000000000000ULL) |
      ((x << 24) & 0x0000FF0000000000ULL) |
      ((x << 8)  & 0x000000FF00000000ULL) |
      ((x >> 8)  & 0x00000000FF000000ULL) |
      ((x >> 24) & 0x0000000000FF0000ULL) |
      ((x >> 40) & 0x000000000000FF00ULL) |
      (x >> 56);
}

The code was found in this article. Thank you very much for your time.


Solution

  • The underlying operating system, at least, supports the behavior you want. From the mmap(2) man page:

       MAP_PRIVATE
                  Create a private copy-on-write mapping.  Updates
                  to the mapping are not visible to other processes
                  mapping the same file, and are not carried through
                  to the underlying file.  It is unspecified whether
                  changes made to the file after the mmap() call are
                  visible in the mapped region.
    

    In the Boost.Iostreams implementation, the priv flag appears to translate into MAP_PRIVATE:

    void* data = 
        ::BOOST_IOSTREAMS_FD_MMAP( 
            const_cast<char*>(p.hint), 
            size_,
            readonly ? PROT_READ : (PROT_READ | PROT_WRITE),
            priv ? MAP_PRIVATE : MAP_SHARED,
            handle_, 
            p.offset );
    if (data == MAP_FAILED)
        cleanup_and_throw("failed mapping file");
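
To make the copy-on-write behavior concrete, here is a minimal sketch that calls POSIX mmap directly instead of going through Boost. It maps a file with MAP_PRIVATE and swaps every 32-bit word in place. The names PrivateMapping, map_private, swap_all_u32 and unmap are hypothetical helpers made up for this illustration, and the code assumes a POSIX system and a file that consists entirely of 32-bit integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Same 32-bit swap as in the question, written with a fixed-width type.
inline void endian_swap(std::uint32_t& x)
{
    x = (x >> 24) |
        ((x << 8) & 0x00FF0000u) |
        ((x >> 8) & 0x0000FF00u) |
        (x << 24);
}

// Hypothetical helper type for this sketch (not part of Boost or POSIX).
struct PrivateMapping {
    void*       addr = nullptr;
    std::size_t size = 0;
};

// Map the whole file copy-on-write. The descriptor is opened read-only;
// PROT_WRITE is still permitted because a MAP_PRIVATE mapping never
// writes back to the underlying file.
inline PrivateMapping map_private(const char* path)
{
    PrivateMapping m;
    int fd = ::open(path, O_RDONLY);
    if (fd == -1) return m;

    struct stat st;
    if (::fstat(fd, &st) == 0 && st.st_size > 0) {
        void* p = ::mmap(nullptr, static_cast<std::size_t>(st.st_size),
                         PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            m.addr = p;
            m.size = static_cast<std::size_t>(st.st_size);
        }
    }
    ::close(fd);  // the mapping stays valid after the descriptor is closed
    return m;     // addr is nullptr if anything failed
}

// Swap every complete 32-bit word in the mapping; trailing bytes are ignored.
inline void swap_all_u32(PrivateMapping& m)
{
    assert(m.addr != nullptr);
    auto* words = static_cast<std::uint32_t*>(m.addr);
    for (std::size_t i = 0; i < m.size / sizeof(std::uint32_t); ++i)
        endian_swap(words[i]);  // the first write to a page makes a private copy
}

// Release the mapping and reset the handle.
inline void unmap(PrivateMapping& m)
{
    if (m.addr) ::munmap(m.addr, m.size);
    m = PrivateMapping{};
}
```

A caller would use map_private on the file, run swap_all_u32 once, read the integers in host order, and call unmap when done. One caveat on performance: every page that is written to becomes a private copy held in memory (or swap), so swapping all integers of an ~8 GB file this way ends up duplicating the whole file. If each value is read only once, a variant of endian_swap that takes its argument by value and returns the swapped result avoids modifying the mapping at all and works with a plain read-only mapping. With Boost.Iostreams, the counterpart of this sketch is presumably to open the mapped_file in priv mode rather than read-only mode.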