Are implementations allowed to have `new` and/or `malloc` allocate far more memory than requested, so they can avoid overhead for later small allocations?
In my experience, no one ever allocates single objects on the heap because of how costly it is; people write small-object allocators or simply create large arrays where possible. So an implementation doing this for the programmer feels like it should be an easy ergonomics/performance feature to implement.
Do compilers already do this, or does the standard or another issue prevent this?
Most operating systems [citation needed] manage memory in chunks, usually called "pages". This is an artifact of the underlying hardware.
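For example, on a POSIX system the page size can be queried at runtime (a minimal sketch; `sysconf` is POSIX-specific):

```cpp
#include <unistd.h>   // sysconf (POSIX)
#include <iostream>

int main() {
    // Typically prints 4096 on x86-64; the value comes from the hardware.
    std::cout << "page size: " << sysconf(_SC_PAGESIZE) << " bytes\n";
}
```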
It is long-established practice that a library's `malloc` (and, by extension, `new`) would satisfy a user's request for memory by allocating one or more "pages" of memory from the operating system, and then parcel out that memory to the user.(*) Subsequent requests that could be satisfied without having to request more pages from the OS would be satisfied that way.
The gory details vary from system to system and from allocator to allocator. They usually attempt to strike a balance between speed (of allocations / deallocations) and efficiency (of minimal memory usage).
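One way to see the effect (a sketch assuming glibc; `malloc_usable_size` is a GNU extension, not standard C or C++):

```cpp
#include <cstdio>
#include <cstdlib>
#include <malloc.h>   // malloc_usable_size -- glibc-specific

int main() {
    // Ask for a single byte; the allocator hands back a larger usable
    // block, carved out of memory it already obtained from the OS.
    void* p = std::malloc(1);
    std::printf("requested 1 byte, got a usable block of %zu bytes\n",
                malloc_usable_size(p));
    std::free(p);
}
```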
Also traditional is that applications with specific memory requirements (speed, efficiency) `malloc` a big chunk of memory in one go, and then manage that memory on their own. This adds complexity and more chances for bugs (e.g. memory allocated through the application's memory management but `free()`d, or memory `malloc()`ed but freed through the application's memory management, or an error in the application's memory management itself), but it allows the application to control the algorithms used.
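A minimal sketch of that approach, assuming fixed-size nodes (the `FixedPool` name is illustrative; a real pool would also handle alignment, exhaustion, and thread safety):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Toy pool: grabs one big chunk from malloc() up front, then hands out
// fixed-size nodes from a free list -- no further library allocations
// until the pool itself is destroyed.
class FixedPool {
    struct Node { Node* next; };
    void* chunk_;
    Node* free_list_ = nullptr;
public:
    FixedPool(std::size_t node_size, std::size_t count) {
        // Assumes node_size is a multiple of the required alignment.
        assert(node_size >= sizeof(Node));
        chunk_ = std::malloc(node_size * count);
        char* p = static_cast<char*>(chunk_);
        for (std::size_t i = 0; i < count; ++i) { // thread all nodes
            auto* n = reinterpret_cast<Node*>(p + i * node_size);
            n->next = free_list_;
            free_list_ = n;
        }
    }
    ~FixedPool() { std::free(chunk_); }

    void* allocate() {            // O(1): pop a node off the free list
        if (!free_list_) return nullptr;
        Node* n = free_list_;
        free_list_ = n->next;
        return n;
    }
    void deallocate(void* p) {    // O(1): push the node back
        auto* n = static_cast<Node*>(p);
        n->next = free_list_;
        free_list_ = n;
    }
};

int main() {
    FixedPool pool(sizeof(double), 1024); // room for 1024 doubles
    void* a = pool.allocate();
    pool.deallocate(a);                   // never mix with free()!
}
```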
C++ makes this easier through allocators, which effectively "outsource" a container's memory management to a different class, making it possible to employ customized, re-usable memory management classes.
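A minimal sketch of such an allocator (the `LoggingAllocator` name and the logging are illustrative only); plugging it into `std::vector` makes the "outsourced" allocations visible:

```cpp
#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <new>
#include <vector>

// Minimal C++11-style allocator: the container delegates all memory
// management to this class.
template <typename T>
struct LoggingAllocator {
    using value_type = T;

    LoggingAllocator() = default;
    template <typename U>
    LoggingAllocator(const LoggingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        std::cout << "allocating " << n * sizeof(T) << " bytes\n";
        if (void* p = std::malloc(n * sizeof(T)))
            return static_cast<T*>(p);
        throw std::bad_alloc();
    }
    void deallocate(T* p, std::size_t) { std::free(p); }
};

template <typename T, typename U>
bool operator==(const LoggingAllocator<T>&, const LoggingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const LoggingAllocator<T>&, const LoggingAllocator<U>&) { return false; }

int main() {
    std::vector<int, LoggingAllocator<int>> v;
    for (int i = 0; i < 100; ++i)
        v.push_back(i); // watch the geometric growth of the allocations
}
```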
So:

1. Yes, implementations are allowed to do this, and they already do: `malloc` and `new` get memory from the OS in pages and parcel it out, precisely to avoid the overhead of small allocations.
2. If the general-purpose allocator does not meet your requirements, C++ allocators let you plug customized, re-usable memory management into containers.
3. If even that is not enough, you can do all the memory management yourself, accepting the complexity and bug potential described above.
The corollary to point 3 is, of course, the old truism: *measure, optimize, measure*. (Don't try to optimize away a problem you do not have, and if you do, make sure your optimization actually improved things instead of making them worse.)
(*) The hardware that introduces the concept of "pages" is the same hardware that protects separate applications' memory spaces from each other -- the Memory Management Unit. To avoid applications subverting that protection, only the operating system is allowed to modify the allocation of memory. Terminology and architecture differ, but there is usually some kind of "supervisor mode" that is only available to the OS kernel, so an application has to call into the kernel, which then does the allocation, and then returns control to the application.
This is called a "context switch", and in terms of CPU time, it is among the most expensive operations there are. So from the very beginning, library implementors have looked for ways to minimize context switches. That's why `malloc` and `new` are usually rather well-optimized for general usage already.
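For illustration, this is roughly what asking the kernel for fresh pages looks like (a sketch assuming Linux-style POSIX; `MAP_ANONYMOUS` is not universally available) -- each such call involves a context switch, which is exactly the cost `malloc` tries to amortize:

```cpp
#include <sys/mman.h>  // mmap, munmap (POSIX)
#include <cstddef>
#include <cstdio>

int main() {
    const std::size_t size = 4096; // one typical page
    // Roughly what an allocator does when it runs out of already-acquired
    // memory: ask the kernel for fresh pages via a system call.
    void* pages = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (pages == MAP_FAILED) { std::perror("mmap"); return 1; }
    // ... parcel this memory out to many small allocations ...
    munmap(pages, size);
}
```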