Search code examples
c++data-structuresvirtual-memorypage-tables

Using page-table remapping to avoid data-copying during array-reallocation


Say you have a std::vector-like container class, and that vector is already filled to 100% of its capacity with data items, and the calling code calls push_back() to add one more data item.

At the point, the vector will need to reallocate its internal array to hold the additional data item (possibly with some extra space for any further items the calling code might want to use).

In particular, it will have to:

  1. Allocate a new/larger array
  2. Copy/move all the objects from the old array to the new array
  3. Update its internal state to point to the new array
  4. free the old array

... all of which can be rather time-consuming, and also requires enough RAM to hold both the old and the new array in memory simultaneously.

However, most modern computers feature virtual memory, which allows physical pages of RAM to be mapped to virtual addresses. Therefore, it should be possible, at least in principle, to replace the above process with this one:

  1. Allocate a continuous range of virtual addresses large enough to represent the new/larger array.
  2. Map the first N physical pages associated with the existing array to the first N virtual-page-address-blocks of the new virtual-address range.
  3. Unmap the old array's virtual addresses from the physical pages.
  4. Allocate new physical pages for the additional space in the new array (or maybe just set it up so that the additional virtual addresses in the new array will page-fault to demand-allocate the physical pages when they are accessed)

This would have the advantage of not requiring any data-copying at all, so expanding the vector's capacity could be quite efficient. The disadvantage (other than limited portability) would be that any pointers stored within the data would not be updated to the new set of virtual addresses, so this technique would only work for data that doesn't include any pointers into the vector's old array.

My question is, is this technique viable in a userspace program? If so, how would one go about implementing it? Or if not, what prevents it from being done? (i.e. are there any fundamental reasons why it couldn't work, or is it just that no suitable APIs currently exist to do it?)


Solution

  • Yes, it can be done and works reasonably well. Windows gives enough access to page faults in user mode to do it.

    Jeffrey Richter's books on Win32 from around the Windows NT 4 timeframe gave examples of implementing a dynamic array this way. It didn't attempt to be an accurate implementation of std::vector, but showed enough about how to do the dynamic resizing that the rest of a conforming implementation would have been pretty straightforward.

    Unfortunately, at least in when I tested it, I didn't see much speed improvement. With enough care, maybe it would have made a bigger difference, but what I found wasn't particularly encouraging.