The program I’m working on needs to save some information on every item of its input. As the number of items to be processed will be huge, I cannot keep the data in memory (I iterate over the input in a first pass and record the information).
I want to place the extra data into virtual memory, but would like it to go to disk when memory becomes scarce (that’s /when/, not /if/. It will become scarce).
Currently I create a sparse file of (hopefully) appropriate size, mmap the whole thing, then close and unlink the file. I unlink the file because I no longer need the information once the program quits.
The program can now happily read and write to the mapped region, all is well.
Until the program exits. Then the kernel starts to write all this now-useless data to disk, although the file is no longer open or linked. I first thought that the kernel would realize that the data is no longer accessible, but apparently not. So I first added a call to madvise with MADV_REMOVE at the end of the program, and when that didn’t help, I added MADV_DONTNEED as well. Neither helped with my problem.
The worst thing is that this blocks my machine, as every write any other program makes (like my text editor) has to wait for this long-running write to complete.
Is there any way to convince the kernel to not write this data to disk?
Going over the comments, it appears that using swap instead of file storage is fine for your needs. If that's the case, I think your best bet is to use a file, as you've done, but on a tmpfs partition. The best tmpfs partition to use for that purpose is at /dev/shm.
Just open a file in /dev/shm, truncate it to the size you need, mmap it and unlink it, precisely as you've already done. /dev/shm uses main memory as its "backing store", but that will get swapped out if memory is short.
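The steps above could look roughly like the following sketch. The helper name map_scratch and the file name template are made up for illustration, and the code assumes a Linux system where /dev/shm is a tmpfs mount large enough for the requested size.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map `size` bytes backed by an unlinked file in
 * /dev/shm. Returns MAP_FAILED on error, like mmap itself. */
static void *map_scratch(size_t size)
{
    /* The template name is an assumption; mkstemp replaces the XXXXXX. */
    char path[] = "/dev/shm/scratch-XXXXXX";
    void *region;
    int fd;

    assert(size > 0);  /* mmap rejects zero-length mappings anyway */

    fd = mkstemp(path);
    if (fd == -1)
        return MAP_FAILED;

    /* Remove the name right away; the open descriptor keeps the file alive. */
    unlink(path);

    /* Grow the (sparse) file to the requested size. */
    if (ftruncate(fd, (off_t)size) == -1) {
        close(fd);
        return MAP_FAILED;
    }

    region = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* The mapping holds its own reference, so the descriptor can go. */
    close(fd);
    return region;
}
```

Call map_scratch(n), read and write the returned region as before, and munmap it (or simply exit) when you are done.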
The advantage of using swap is that no forced flush will happen to pages that still fit in memory at the point the program exits. Right after exit, these pages are recognized as unneeded and discarded. This should solve your problem while still allowing you to resize etc.
It has the extra benefit of requiring almost no change to your current code :-)