Search code examples
c++memorymemory-addressallocator

How does this memory allocator get the address of the block?


So, I'm reading Writing a memory allocator on this website: http://dmitrysoshnikov.com/compilers/writing-a-memory-allocator/

I'm confused by this part, where the helper function retrieves the block's address:

/**
 * Returns the object header.
 */
Block *getHeader(word_t *data) {
  return (Block *)((char *)data + sizeof(std::declval<Block>().data) -
                   sizeof(Block));
}

This is a block:

struct Block {
 
  // -------------------------------------
  // 1. Object header
 
  /**
   * Block size.
   */
  size_t size;
 
  /**
   * Whether this block is currently used.
   */
  bool used;
 
  /**
   * Next block in the list.
   */
  Block *next;
 
  // -------------------------------------
  // 2. User data
 
  /**
   * Payload pointer.
   */
  word_t data[1];
 
};

Could somebody explain the logic behind this? Cheers.


Solution

  • A memory allocator needs to maintain metadata about the size of each allocated block and where the available unallocated memory is. It is common for an allocator to maintain that metadata longside the allocated regions themselves. That is what the example allocator is doing. struct Block Represents an allocated block of memory including its associated metadata. The first three members (size, used, next) are the metadata. data corresponds to the space presented to the allocator's user (see also below).

    In response to an allocation request, the allocator will choose an available block or maybe merge or split blocks to obtain a suitable one, mark it used, and return a pointer to its data member to the caller. Later, it will want to access the metadata that goes with that data pointer again, and that's what the getHeader() function achieves:

    • data is expected to be a pointer to the data member of a struct Block.
    • It is cast to type char * so that the pointer addition is performed in units of bytes rather than word_ts.
    • sizeof(std::declval<Block>().data) - sizeof(Block), which is C++, not C, computes the (negative) offset of the beginning of a struct Block from the beginning of its data member, assuming, unsafely, that there is no trailing padding in struct Block's layout.
    • The pointer addition therefore yields a char * pointing to the address of the beginning of the block, and
    • casting that to Block * yields the wanted pointer.

    A word about the data member: this member is declared as an array of length 1, with the expectation that the allocator user can safely overrun its bounds. This is known as the "struct hack". Although it happens to work in many implementations, it does not conform to any C or C++ standard. Its behavior therefore being undefined, you should not take this as a model to be copied.

    Note also that the accompanying comment is misleading: data is not a pointer, but rather (the first word_t of) the data themselves.