Search code examples
memory-managementfilesystemsdynamic-memory-allocationallocator

Why do not use log-structured allocator for main memory


I have just learned the log-structued file system. And I am so confused that why not use log-structured as main memory allocator. It can significantly reduces fragment.


Solution

  • A log-structured filesystem uses sequential writes to a circular log for persisting filesystem data and usually handles updates the very same way. At some point, the log-structured filesystem must reclaim unused space (obsolete entries in the log) to make it available for future writes. In a trivial implementation, unused entries could be reclaimed by re-writing the log and skipping unused entries in the process whenever the filesystem runs out of diskspace.

    A memory allocator for native code can do something similar. A trivial implementation would entail a large block of memory and a next pointer that needs to be incremented for the process of allocation. Deallocation would need to mark an entry as freed (could be a flag in the entry or a dedicated free list) or deallocations would need to be restricted in other ways (deallocation in first-in-first-out order, only the whole allocation space can be deallocated). In fact, allocators of that kind are known as "linear allocators" and are in use today. One advantage is allocation performance, another is the simplicity and efficiency of deallocation if it happens in FIFO order or affects the whole allocation space. Stack allocation is one prominent example. JVMs often use linear allocators for object allocation. The Apache webserver uses a variant of that to handle per-request memory allocations.

    Using linear allocators as general-purpose allocators is more problematic because of the difficulty of space reclamation. To reclaim space, it's insufficient to mark entries as free because this could lead to high fragmentation and defeats the advantages of linear allocation (merely having to increment a pointer for the actual task of allocation). So, similarly to the filesystem, the allocation space must be compacted so that it only contains allocated entries and the free space can be used for linear allocation. Compacting requires to move allocated entries around - a process that changes and invalidates their previously known addresses. In native code, locations of references to allocated entries (addresses stored as pointers) are unknown to the allocator. Existing references would have to be patched to make the allocator action transparent to the caller, that's not feasible for a general-purpose allocator like malloc.

    Why is it not feasible? Updating existing references would require the following steps:

    • Suspending all threads to stop all allocation entry mutators (unless a so-called "write barrier" is employed)
    • Scanning registers, stacks, and the heap for pointers to moved objects
      • The exact location of pointers is unknown, data matching a reference might be mistaken for a reference (not a problem in case of a conservative GC like Boehm, for example, that does not perform copying. False positives would simply delay collection), thus memory would potentially be corrupted
      • Pointers are potentially unaligned, so the scan would have to happen with a pointer-sized window advanced on a per-byte basis
      • Pointers might be obfuscated due to strategies like pointer tagging and due to data structures like XOR linked lists
      • Code might rely on the previous pointer value (in contrast to managed code, the value of the reference can be read)
    • Resuming all previously suspended threads

    With RTTI, the required metdata could be provided, but passing pointers to libraries that are outside of your control (e.g. glibc) would still be an issue. So, in a limited scope, this could be implemented. A general-purpose allocator must be usable in all native code scenarios - with a linear allocator that requires moving allocation entries, there are too many constraints that make this infeasible. On top of that, the low-level mechanisms used for stopping threads and establishing write barriers could interfere with similar mechanisms employed by the user of the allocator (e.g. a JVM).

    In case of managed code, however, this is common practice. A copying garbage collector, for example, maintains two allocation spaces - from-space and to-space - to handle compaction. During garbage collection, only referenced allocation entries are copied to the other allocation space. Once done, allocations can be handled in a linear fashion again.

    In specific scenarios, allocators that use strategies from log-structured filesystems are being used already. For general-purpose memory allocation in respect to native code, the fact that moving allocation entries is not feasible means that linear allocators cannot replace more traditional memory allocation strategies.

    Instead of going the linear allocation route for reducing fragmentation, one alternative is using pool allocators, which provide allocation spaces for allocation entries of a fixed size. By limiting the allocation space that way, fragmentation can be reduced. A number of general-purpose allocators use pool allocators for small allocations. In some cases, those application spaces exist on a per-thread basis to eliminate the need for locking and improve CPU cache utilization.