I have a CoW region of memory that I need to reset to the original state.
Sadly, MADV_DONTNEED seems to behave exactly like munmap, and appears to free all pages. munmap is extremely expensive here and the performance is horrendous to say the least: it's way cheaper to create a new mapping from scratch using MAP_ANONYMOUS, initialize it manually, then munmap that. That makes zero sense to me, and suggests something is really broken with mmap and CoW mappings. Unfortunately for me I really need CoW. It's that or memcpy from one range to another, and since we are in 2021 I expect Linux to be able to do copy-on-write.
See: https://kostja.github.io/2012/04/04/1111.html
I would like to only discard the dirty pages of my memory range.
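To make the goal concrete, here is a minimal sketch of the reset I am after, assuming a Linux system with glibc 2.27 or newer (for memfd_create) and a MAP_PRIVATE view of a memfd as the CoW region. The helper name cow_reset_demo and the fill patterns are made up for illustration. It only shows the semantics (the view shows the master contents again after MADV_DONTNEED); it says nothing about how expensive the call is.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if a dirtied MAP_PRIVATE view of a memfd
 * shows the original contents again after MADV_DONTNEED, -1 otherwise. */
int cow_reset_demo(void)
{
    const size_t len = 4 * (size_t)sysconf(_SC_PAGESIZE);
    int result = -1;

    int fd = memfd_create("cow-master", 0);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    /* Fill the master copy through a shared mapping. */
    unsigned char *master = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                 MAP_SHARED, fd, 0);
    if (master == MAP_FAILED) {
        close(fd);
        return -1;
    }
    memset(master, 0xAA, len);

    /* The CoW view: writes land in private copies of the touched pages. */
    unsigned char *view = mmap(NULL, len, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE, fd, 0);
    if (view == MAP_FAILED) {
        munmap(master, len);
        close(fd);
        return -1;
    }

    /* Dirty the first and last page of the view. */
    view[0] = 0x55;
    view[len - 1] = 0x55;

    /* Drop the private pages; the next access faults the master back in. */
    if (madvise(view, len, MADV_DONTNEED) == 0 &&
        view[0] == 0xAA && view[len - 1] == 0xAA && master[0] == 0xAA)
        result = 0;

    munmap(view, len);
    munmap(master, len);
    close(fd);
    return result;
}
```

Calling cow_reset_demo() from main and checking for a 0 return is enough to try it out. The shared mapping is only there to populate the master; a real setup could fill the memfd with write() instead.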
munmap anon: 435110 ns (435 µs)
munmap memfd: 21958015 ns (21958 µs)
This is the average time when freeing 400x 128MB ranges. If I only free 1 range then I get sane numbers, so there's some kind of bad scaling going on in the kernel that I don't understand. The tmpfs-backed area is untouched after allocating it with MAP_NORESERVE. This is completely insane. Are memfd (tmpfs-backed) files just that slow?
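A rough sketch of this kind of measurement, for anyone who wants to reproduce it. This is not the exact benchmark behind the numbers above: the function name avg_munmap_ns, the use of MAP_PRIVATE for the memfd case, and the choice to time only the munmap loop are all assumptions. The mappings are left untouched, as described.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical harness: maps `count` untouched ranges of `len` bytes, then
 * returns the average time in ns per munmap call, or -1 on setup failure.
 * use_memfd != 0 maps one memfd privately; otherwise anonymous memory. */
int64_t avg_munmap_ns(int use_memfd, size_t len, int count)
{
    if (count <= 0)
        return -1;

    void **maps = calloc((size_t)count, sizeof *maps);
    if (maps == NULL)
        return -1;

    int fd = -1;
    if (use_memfd) {
        fd = memfd_create("bench", 0);
        if (fd < 0 || ftruncate(fd, (off_t)len) != 0) {
            if (fd >= 0)
                close(fd);
            free(maps);
            return -1;
        }
    }

    int flags = MAP_PRIVATE | MAP_NORESERVE | (use_memfd ? 0 : MAP_ANONYMOUS);
    int made = 0;
    for (; made < count; made++) {
        maps[made] = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, fd, 0);
        if (maps[made] == MAP_FAILED)
            break;
    }

    /* Only the unmapping is timed. */
    uint64_t start = now_ns();
    for (int i = 0; i < made; i++)
        munmap(maps[i], len);
    uint64_t elapsed = now_ns() - start;

    if (fd >= 0)
        close(fd);
    free(maps);

    if (made != count)
        return -1;
    return (int64_t)(elapsed / (uint64_t)count);
}
```

Something like avg_munmap_ns(1, (size_t)128 << 20, 400) would correspond to the memfd case, and avg_munmap_ns(0, (size_t)128 << 20, 400) to the anonymous one. Results will vary with kernel version and machine.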
The fastest way ended up being to use hardware virtualization itself to implement the copy-on-write mechanism. It ended up being extremely complex and fraught with footguns, but most importantly very fast. It is possible to use just a few pages of working memory to call into a copy-on-write VM, and most of those pages hold duplicated page table entries.
Additionally, this opens up the possibility for copies of copies, as well as flattening a copy-on-write VM so that it can be used as master.
Linux has no support for this whatsoever.