Tags: c, linux, memory, posix

Alias memory in C


If I have a rope-like datastructure such as this:

struct rope {
  struct { char *buf; size_t len; } *segments;
  size_t len;
};

Is there a way to map each segment's buffer contiguously into virtual memory, without copying? My use case is to convert the rope to a string in the most efficient way possible.

Here's an example of how it would be used:

char *s = rope_flatten(r);
printf("%s\n", s);
r.segments[0].buf[4] = 'x';
printf("%s\n", s); // The 5th character is now replaced with 'x'

Needless to say, the buffers (at least, the bit that's mapped) would not be NUL terminated.

I understand this is probably incredibly platform-specific. If there's something POSIX-compliant that would be awesome. If not, Linux is my primary target and I can fall back to malloc and memcpy if it's not supported.
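For reference, the malloc and memcpy fallback mentioned above could look roughly like the sketch below. It makes a few assumptions that the question does not state: the segment struct is given a name (rope_segment) so it can be used outside the rope, the rope's len field is taken to be the number of segments, and the function name rope_flatten_copy is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Same layout as the rope in the question; the segment type is named here
   (an assumption) so that callers can build segments themselves. */
struct rope_segment { char *buf; size_t len; };
struct rope {
  struct rope_segment *segments;
  size_t len; /* assumed to be the number of segments */
};

/* Copying fallback (hypothetical name). Returns a malloc'd, NUL-terminated
   string holding all segments back to back, or NULL if allocation fails.
   The caller frees the result. */
char *rope_flatten_copy(const struct rope *r) {
  assert(r != NULL);

  /* First pass: total number of bytes across all segments. */
  size_t total = 0;
  for (size_t i = 0; i < r->len; i++)
    total += r->segments[i].len;

  char *out = malloc(total + 1); /* +1 for the terminating NUL */
  if (out == NULL)
    return NULL;

  /* Second pass: copy each segment to its offset in the flat buffer. */
  size_t off = 0;
  for (size_t i = 0; i < r->len; i++) {
    if (r->segments[i].len != 0)
      memcpy(out + off, r->segments[i].buf, r->segments[i].len);
    off += r->segments[i].len;
  }
  out[off] = '\0';
  return out;
}
```

Unlike the aliasing behaviour asked for, the result is a snapshot: writing to a segment afterwards does not change the flattened string.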


Solution

  • In a perfect world, you could create your rope as usual and map each segment a second time, giving you a second view in which the segments are contiguous. If you edited a segment, the change would be mirrored in the other view, because both views would refer to the same physical pages. If you enlarged a segment, the flattened buffer would be updated transparently.

    A couple of problems make this infeasible, though:

    • Virtual memory granularity is per page, not per byte. Pages are usually 4 KiB, and each one is mapped as a whole. Your segments are probably a tad smaller than that.
    • Adding and removing segments will still require page table modifications.
    • Having to call into the kernel to update the mappings on every modification will easily dwarf any performance benefit you hope to gain.
    • Portability problems: the APIs are OS-specific, an MMU is required, and there are funny effects when combined with VIVT caches.
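To make the granularity point concrete, here is a minimal sketch of a check for whether a segment covers whole pages exactly. The helper name segment_is_page_mappable is hypothetical; sysconf(_SC_PAGESIZE) is the POSIX way to query the page size. Passing this check is necessary for remapping a segment but not sufficient, since the memory also has to be backed by something that can be mapped a second time.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for sysconf(); must precede every #include */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: true if [buf, buf + len) starts on a page boundary
   and spans a non-zero whole number of pages. */
bool segment_is_page_mappable(const void *buf, size_t len) {
  long page = sysconf(_SC_PAGESIZE);
  assert(page > 0);
  size_t p = (size_t)page;
  return len != 0 && (uintptr_t)buf % p == 0 && len % p == 0;
}
```

A typical rope segment of a few dozen bytes fails this check, which is why the copying approach usually wins.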

    There are still use cases for mapping multiple views into an address space, though, namely when the data changes but the data structure does not, as with a ring buffer:

    Normally, you would have to implement the wrap-around logic in software, but if you map the ring buffer's page(s) a second time directly after the first mapping, you get a ring buffer you can simply memcpy into and out of, even across the wrap-around point. GNU Radio employs this mechanism and has a nice blog post on it: https://www.gnuradio.org/blog/buffers/
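As a rough illustration of the double mapping, here is a Linux-only sketch. It is not GNU Radio's or libvas's actual code; the names ring_map and ring_unmap are made up. It assumes memfd_create() is available (Linux 3.17 or later with glibc 2.27 or later) and that the requested size is a multiple of the page size.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for memfd_create(); must precede every #include */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper. Maps `size` bytes of shared memory twice, back to
   back, and returns the base of the resulting 2*size window, or NULL on
   failure. `size` must be a non-zero multiple of the page size. */
char *ring_map(size_t size) {
  long page = sysconf(_SC_PAGESIZE);
  if (page <= 0 || size == 0 || size % (size_t)page != 0)
    return NULL;

  int fd = memfd_create("ring", 0); /* anonymous, memory-backed file */
  if (fd == -1)
    return NULL;
  if (ftruncate(fd, (off_t)size) == -1) {
    close(fd);
    return NULL;
  }

  /* Reserve 2*size of contiguous address space first, so that both views
     are guaranteed to end up next to each other. */
  char *base = mmap(NULL, 2 * size, PROT_NONE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (base == MAP_FAILED) {
    close(fd);
    return NULL;
  }

  /* Overlay both halves of the reservation with the same pages. */
  void *lo = mmap(base, size, PROT_READ | PROT_WRITE,
                  MAP_SHARED | MAP_FIXED, fd, 0);
  void *hi = mmap(base + size, size, PROT_READ | PROT_WRITE,
                  MAP_SHARED | MAP_FIXED, fd, 0);
  close(fd); /* the mappings keep the memory alive */
  if (lo == MAP_FAILED || hi == MAP_FAILED) {
    munmap(base, 2 * size);
    return NULL;
  }
  return base;
}

/* Releases both views created by ring_map(). */
void ring_unmap(char *base, size_t size) {
  assert(base != NULL);
  munmap(base, 2 * size);
}
```

With this in place, a write that starts near the end of the first view and runs past it lands at the start of the buffer, so a single memcpy handles the wrap-around. The read and write indices still have to be reduced modulo the size by the caller.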

    The blog post made me curious, so I reimplemented the mechanism for Linux, Windows and macOS in libvas. Check it out if you like.