c, dynamic-memory-allocation, paging

Looking for a custom memory allocator which allocates from within a large pre-allocated block of memory


I have a memory-heavy application which is supposed to run with low latency and at a constant speed, but in practice it performs poorly during the first few seconds after startup. This appears to be because the initial memory accesses trigger page faults, which are expensive.

I would like to try preallocating a single large block of memory, paging it all in (via mlock() or just by touching each byte), and then using a custom malloc()/free() implementation to ensure that all further allocations are done from within this block.
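As a rough sketch of the pre-faulting step described above, assuming a POSIX system with anonymous `mmap()`: the helper below (the name `prefault_block` is made up for illustration) maps a block, tries `mlock()`, and falls back to writing one byte per page if locking is refused, for example because of `RLIMIT_MEMLOCK`. Writing one byte per page should be enough to fault the page in; touching every byte is not required.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map `size` bytes and make them resident up front.
 * Returns NULL if the mapping itself fails. */
static void *prefault_block(size_t size)
{
    void *block = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (block == MAP_FAILED)
        return NULL;

    /* mlock() faults the pages in and keeps them resident, but it can
     * fail when the process exceeds its locked-memory limit. */
    if (mlock(block, size) != 0) {
        long page = sysconf(_SC_PAGESIZE);
        assert(page > 0);

        /* Fallback: write one byte per page. Fresh anonymous pages are
         * zero-filled, so storing 0 does not change their contents. */
        volatile unsigned char *p = block;
        for (size_t off = 0; off < size; off += (size_t)page)
            p[off] = 0;
    }
    return block;
}
```

Note that the fallback only faults pages in once; unlike `mlock()`, it does not stop the kernel from paging them out again later.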

I am aware of numerous custom memory allocators (TCMalloc, Hoard, jemalloc, etc.), but it is not clear to me whether they can be backed by user-provided memory, or whether they always obtain their backing memory from the OS. Does anyone have any insight or recommendations here?

To be clear, I am not looking for a memory pooling system (which would be for reusing small objects). The custom implementation of malloc()/free() should be able to perform any size allocation while limiting fragmentation of its backing store and following other best practices.
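To make the requirement concrete, here is a deliberately small sketch of the kind of interface I mean: a first-fit allocator that works only inside a caller-supplied block, splits blocks on allocation, and merges adjacent free blocks on release. All names (`arena_t`, `arena_init`, `arena_alloc`, `arena_free`) are invented for illustration. It is single-threaded and uses a linear search, so it is nowhere near a production allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_ALIGN ((size_t)16)

typedef struct block_hdr {
    size_t size;             /* payload bytes that follow this header */
    int free;                /* 1 if the block is available */
    struct block_hdr *next;  /* next block by address, NULL at the end */
} block_hdr;

typedef struct { block_hdr *head; } arena_t;

static size_t align_up(size_t n)
{
    return (n + ARENA_ALIGN - 1) & ~(ARENA_ALIGN - 1);
}

/* Header size rounded up so that payloads stay 16-byte aligned. */
#define HDR_SIZE (align_up(sizeof(block_hdr)))

/* Turn a caller-supplied region into one big free block. Returns 0 on success. */
static int arena_init(arena_t *a, void *mem, size_t size)
{
    uintptr_t raw = (uintptr_t)mem;
    uintptr_t start = (raw + ARENA_ALIGN - 1) & ~(uintptr_t)(ARENA_ALIGN - 1);
    size_t skip = (size_t)(start - raw);
    if (size < skip + HDR_SIZE + ARENA_ALIGN)
        return -1;
    size_t usable = (size - skip) & ~(ARENA_ALIGN - 1);
    a->head = (block_hdr *)start;
    a->head->size = usable - HDR_SIZE;
    a->head->free = 1;
    a->head->next = NULL;
    return 0;
}

/* First fit: take the first free block that is large enough. */
static void *arena_alloc(arena_t *a, size_t n)
{
    if (n == 0 || n > SIZE_MAX - ARENA_ALIGN)
        return NULL;
    n = align_up(n);
    for (block_hdr *b = a->head; b != NULL; b = b->next) {
        if (!b->free || b->size < n)
            continue;
        /* Split only if the remainder can hold a header and a minimal payload. */
        if (b->size - n >= HDR_SIZE + ARENA_ALIGN) {
            block_hdr *rest = (block_hdr *)((unsigned char *)b + HDR_SIZE + n);
            rest->size = b->size - n - HDR_SIZE;
            rest->free = 1;
            rest->next = b->next;
            b->size = n;
            b->next = rest;
        }
        b->free = 0;
        return (unsigned char *)b + HDR_SIZE;
    }
    return NULL; /* the block is exhausted; the OS is never asked for more */
}

/* Mark the block free, then merge neighbouring free blocks. */
static void arena_free(arena_t *a, void *p)
{
    if (p == NULL)
        return;
    block_hdr *b = (block_hdr *)((unsigned char *)p - HDR_SIZE);
    assert(!b->free); /* catches simple double frees */
    b->free = 1;
    for (block_hdr *c = a->head; c != NULL; ) {
        if (c->free && c->next != NULL && c->next->free) {
            c->size += HDR_SIZE + c->next->size;
            c->next = c->next->next;
        } else {
            c = c->next;
        }
    }
}
```

The point is that the allocator never grows: once the block is used up, `arena_alloc` returns `NULL` instead of going back to the OS.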

Edit based on comments: I do not expect to make the system faster - I just want to move the slow part (allocation, initial page faults) to the start of the process, and then do the real computation work once the system is 'primed'.

Thanks!


Solution

  • A bit late to the party.

    dlmalloc is one allocator that can be backed by pre-allocated memory. You can find it here. You may need to add a few definitions at the top of the file to make it use your pre-allocated memory instead of calling the system mmap; the documentation at the beginning of the file explains the available options.
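As an illustration of the kind of definitions meant here, the fragment below assumes dlmalloc 2.8.x and its mspace API; check the option names against the comments in your copy of `malloc.c`, since this is a sketch and not a tested configuration. The names `block` and `block_size` are placeholders for the pre-faulted region.

```c
/* Assumed build options, set before the body of dlmalloc's malloc.c
 * or passed to the compiler as -D flags. */
#define MSPACES       1  /* compile the mspace (independent heap) API */
#define ONLY_MSPACES  1  /* do not define the regular malloc()/free() */
#define HAVE_MMAP     0  /* never call the system mmap */
#define HAVE_MORECORE 0  /* never call sbrk */

/* Usage fragment (block and block_size are hypothetical):
 *
 *   mspace heap = create_mspace_with_base(block, block_size, 0);
 *   void *p = mspace_malloc(heap, 4096);
 *   mspace_free(heap, p);
 *
 * The last argument of create_mspace_with_base() is the `locked` flag;
 * a non-zero value needs USE_LOCKS to be enabled as well.
 */
```

With both `HAVE_MMAP` and `HAVE_MORECORE` disabled, the mspace has no way to grow beyond the supplied block, so an exhausted heap should show up as a `NULL` return rather than as a new request to the OS.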