Window control for an mmapped large file (Linux, mmap)

How can we control the resident window (RSS) when mapping a large file? Let me explain what I mean. Suppose we have a file that is several times larger than RAM, and several processes map it with a shared mmap. When we access an object whose virtual address lies in the mapped region, we take a page fault and the page is read from disk. The sub-question is: does the opposite happen once we no longer use that object? If eviction works like an LRU, what is the size of the LRU and how can it be controlled? How is the page cache involved in this case?

RSS graph

This is the RSS graph from a test instance (2 threads, 8 GB RAM) for an 80 GB tar file. Where does this value of 3800 MB come from, and why does it stay stable while I run through the file after it has been mapped? How can I control it (or advise the kernel to control it)?

Solution

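As long as you're not taking explicit action to lock the pages in memory (for example with mlock()), they should eventually be evicted again automatically. For a file-backed mapping the resident pages are page cache pages, so they do not go to swap: clean pages are simply dropped, dirty pages of a shared mapping are written back to the file first, and either kind is re-read from the file on the next fault. The kernel basically uses a memory pressure heuristic to decide how much of physical memory to devote to resident pages, and frequently rebalances as needed, so there is no fixed LRU size to configure for the mapping.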
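To watch this rebalancing from inside the process, one option is to read the counters in /proc/self/status. The sketch below is only an illustration: the helper name proc_status_kb is made up for this answer, and it assumes a Linux /proc filesystem. On kernels since 4.5 the RssFile line reports the file-backed part of RSS, which is where pages of a mapped file are counted.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a libc function): returns the value, in kB,
 * of the "<key>:" line in /proc/self/status, or -1 if the file cannot
 * be opened or the key is not present. */
long proc_status_kb(const char *key)
{
    FILE *f = fopen("/proc/self/status", "r");
    if (!f)
        return -1;

    char line[256];
    long kb = -1;
    size_t klen = strlen(key);

    while (fgets(line, sizeof line, f)) {
        /* Match "key:" exactly at the start of the line. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            if (sscanf(line + klen + 1, "%ld", &kb) != 1)
                kb = -1;
            break;
        }
    }
    fclose(f);
    return kb;
}
```

Calling proc_status_kb("VmRSS") and proc_status_kb("RssFile") periodically during the traversal should reproduce a graph like the one in the question and show how much of the plateau is file-backed.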

If you want to take a more active role in controlling this process, have a look at the madvise() system call.

This allows you to tweak the paging algorithm for your mmap, with actions like:

MADV_FREE (since Linux 4.5)
- The application no longer requires the pages in the range specified by addr and len. The kernel can thus free these pages, but the freeing could be delayed until memory pressure occurs. ...
MADV_COLD (since Linux 5.4)
- Deactivate a given range of pages. This will make the pages a more probable reclaim target should there be a memory pressure.
MADV_SEQUENTIAL
- Expect page references in sequential order. (Hence, pages in the given range can be aggressively read ahead, and may be freed soon after they are accessed.)
MADV_WILLNEED
- Expect access in the near future. (Hence, it might be a good idea to read some pages ahead.)
MADV_DONTNEED
- Do not expect access in the near future. (For the time being, the application is finished with the given range, so the kernel can free resources associated with it.) ...
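One practical detail when applying these hints to individual objects: madvise() requires addr to be page-aligned, while objects inside a mapped file usually are not. A minimal sketch of one way to handle this is to shrink the object's byte range inward to whole pages, so that a hint such as MADV_DONTNEED or MADV_COLD never touches a page shared with a neighbouring object. The helper name whole_pages_within is hypothetical, and the power-of-two page size is an assumption that is checked with an assert.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: shrink the byte range [off, off + len) inward to
 * whole pages. On success returns 1 and stores the aligned start offset
 * and length; returns 0 if the range does not cover one full page.
 * `page` is assumed to be a power of two (e.g. sysconf(_SC_PAGESIZE)). */
int whole_pages_within(size_t off, size_t len, size_t page,
                       size_t *start, size_t *alen)
{
    assert(page > 0 && (page & (page - 1)) == 0);

    size_t first = (off + page - 1) & ~(page - 1); /* round start up */
    size_t last  = (off + len) & ~(page - 1);      /* round end down */

    if (last <= first)
        return 0;

    *start = first;
    *alen  = last - first;
    return 1;
}
```

With base being the address returned by mmap(), the result would then be used as madvise(base + start, alen, MADV_COLD), or with whichever hint fits. For MADV_WILLNEED the opposite rounding (outward) would be the more natural choice.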

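Issuing an madvise(MADV_SEQUENTIAL) after creating the mmap might be sufficient to get acceptable behavior. If not, you could also intersperse some MADV_WILLNEED/MADV_DONTNEED access hints (and/or MADV_COLD) during the traversal as you pass groups of pages. Two caveats for your case: MADV_FREE applies only to private anonymous pages, so it will not help with a shared file mapping. And on a shared file mapping, MADV_DONTNEED removes the pages from the process's RSS, but they may stay in the page cache until the kernel decides to reclaim them.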
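Putting these hints together, a sequential pass with a bounded resident window could look roughly like the sketch below. It is not a tuned implementation: the function name scan_with_window and the byte-sum "work" are placeholders, the window size is whatever you choose (it must be a multiple of the page size), and it assumes a 64-bit process so that the whole file fits in the address space.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: sums every byte of `path` through a shared
 * read-only mapping and releases each `window`-byte chunk once the
 * cursor has passed it. `window` must be a non-zero multiple of the
 * page size. Returns 0 and stores the sum in *sum, or -1 on error. */
int scan_with_window(const char *path, size_t window, uint64_t *sum)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0 && window > 0 && window % (size_t)page == 0);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    size_t size = (size_t)st.st_size;

    unsigned char *base = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mapping keeps the file open internally */
    if (base == MAP_FAILED)
        return -1;

    /* Hints are advisory, so failures are deliberately ignored. */
    (void)madvise(base, size, MADV_SEQUENTIAL);

    uint64_t acc = 0;
    for (size_t off = 0; off < size; off += window) {
        size_t len = size - off < window ? size - off : window;

        for (size_t i = 0; i < len; i++)
            acc += base[off + i];

        /* Finished with this window: drop it from our page tables.
         * base + off is page-aligned because window is. */
        (void)madvise(base + off, len, MADV_DONTNEED);
    }

    munmap(base, size);
    *sum = acc;
    return 0;
}
```

With this pattern the process's RSS for the mapping should stay near one window plus whatever readahead has brought in, instead of growing until memory pressure pushes back. Swapping MADV_DONTNEED for MADV_COLD (Linux 5.4+) is a gentler variant: the pages stay mapped but become preferred reclaim targets.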