
linux - map nonlinear parts of file


I have a scenario where I need to map non-contiguous parts of a file contiguously into a process's address space.

For example, if the file is 10 pages long, I may need to map the first 3 pages, skip the next 4, and map the last 3. The mapping should be linear, so that stepping through the process address space goes straight from page 3 of the file to page 8, because pages 4, 5, 6 and 7 are not mapped.

I want to know if this is possible in Linux.

Thanks.


Solution

  • The strategy of calling mmap() multiple times, using MAP_FIXED to specify a fixed address for the second and subsequent mappings, should work. The problem is that if anything was already mapped into the memory immediately after the first mapping, it will get clobbered, because MAP_FIXED automatically unmaps whatever used to be there before making the new mapping.

    I took a look at the layout of some mappings in the address space on a Linux system here, and I observed that, at least some of the time, the addresses chosen by the kernel for memory mappings grow downward from a high address to a low address. That is, a new mapping is given the address space immediately below the address space used by the most recent existing mapping. Under that strategy, when you make your first mapping, it is virtually guaranteed that the address space immediately following that mapping is already occupied by something else (and it's probably something important, too, like a system library). Other systems (different kernel version, different architecture, non-Linux, etc.) might use different address space allocation strategies that make this problem less likely, but you should assume that it can happen and guard against it by using the following technique.

    1. First make a dummy mapping whose size is the sum of the sizes of all of the mappings you want to construct. So if you want to map the first 3 pages of the file, then skip 4, then map 3 more, make a dummy mapping of 6 pages.

      For this dummy mapping, you can just map anonymous memory (MAP_ANONYMOUS). Thanks to Basile Starynkevitch for the suggestion to also use MAP_NORESERVE for this mapping.

    2. Replace this dummy mapping piece by piece with the mappings of the file you actually want, using MAP_FIXED to specify the precise address you would like each mapping to appear at.

      EDIT: I originally suggested destroying the dummy mapping with munmap() before reusing the address space for new mappings, but thanks to jstine for pointing out that this is unnecessary (and it introduces a race condition if your program is multithreaded).

      For the first mapping, use the start address of the dummy mapping. Calculate the address for the second mapping as the start address of the dummy mapping plus the size of the first mapping. This places the second mapping right after the end of the first mapping. Continue the same way for any further mappings, each time adding the sizes of all the mappings made so far. In your scenario, everything is page-sized and page-aligned, so there will be no gaps due to alignment.

    After you finish making all of the mappings in step 2, there should be nothing left of the original dummy mapping.
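
The two steps above can be sketched in C roughly as follows. This is only a minimal sketch, not a definitive implementation: the `struct page_range` type and the `map_ranges_contiguously()` function are hypothetical names made up for illustration, the file is mapped read-only and private, and reserving the dummy region with PROT_NONE is my own assumption (the answer only calls for MAP_ANONYMOUS and MAP_NORESERVE).

```c
#define _GNU_SOURCE /* exposes MAP_ANONYMOUS and MAP_NORESERVE on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: a run of whole pages in the file. */
struct page_range {
    size_t first_page; /* zero-based index of the first page in the file */
    size_t page_count; /* number of consecutive pages to map */
};

/*
 * Map the given page ranges of fd back to back in one contiguous region.
 * Returns the base address, or MAP_FAILED on error. On success *total_len
 * receives the size of the region, for a later munmap(base, *total_len).
 */
void *map_ranges_contiguously(int fd, const struct page_range *ranges,
                              size_t nranges, size_t *total_len)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return MAP_FAILED;
    size_t page = (size_t)ps;

    size_t total = 0;
    for (size_t i = 0; i < nranges; i++)
        total += ranges[i].page_count * page;
    if (total == 0)
        return MAP_FAILED;

    /* Step 1: dummy anonymous mapping that reserves the address space.
     * PROT_NONE is an assumption of this sketch, not part of the answer. */
    char *base = mmap(NULL, total, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (base == MAP_FAILED)
        return MAP_FAILED;

    /* Step 2: replace the dummy mapping piece by piece with MAP_FIXED.
     * Only addresses inside our own reservation are ever overwritten. */
    size_t done = 0;
    for (size_t i = 0; i < nranges; i++) {
        size_t len = ranges[i].page_count * page;
        if (len == 0)
            continue;
        off_t off = (off_t)(ranges[i].first_page * page);
        void *p = mmap(base + done, len, PROT_READ,
                       MAP_PRIVATE | MAP_FIXED, fd, off);
        if (p == MAP_FAILED) {
            perror("mmap");
            munmap(base, total); /* drops the reservation and any pieces */
            return MAP_FAILED;
        }
        done += len;
    }

    assert(done == total); /* nothing of the dummy mapping is left */
    *total_len = total;
    return base;
}
```

For the scenario in the question (zero-based pages 0 to 2, then 7 to 9), the ranges would be `{0, 3}` and `{7, 3}`, and the returned region would be 6 pages long.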