c++ c++11 alignment allocator

std::align and std::aligned_storage for aligned allocation of memory blocks


I'm trying to allocate a block of memory of size size which needs to be Alignment aligned, where the size may not be known at compile time. I know routines such as _aligned_malloc, posix_memalign, _mm_malloc, etc. exist, but I do not want to use them as they reduce code portability.
C++11 provides a routine std::align and also a class std::aligned_storage, from which I can retrieve a POD type to allocate an element which will be aligned to my requirements. However, my goal is to create an allocator which would allocate a block of memory of size size (not just a single element) which would be aligned.
Is this possible using std::align? I ask because std::align moves the pointer, so the class using that pointer would hand the allocator the moved address for deallocation, which would be invalid. Is there a way to create an aligned_allocator this way?


Solution

  • EDIT: after clarifications from the OP, it appears the original answer is off-topic; for reference's sake it is kept at the end of this answer.

    Actually, the answer is rather simple: you simply need to keep a pointer both to the storage block and to the first item.

    This does not actually require a stateful allocator (it would be possible even in C++03, albeit with a custom std::align routine). The trick is that the allocator is not required to ask the system for exactly enough memory to store the user data. It is perfectly free to ask for a bit more for book-keeping purposes of its own.
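Since everything below relies on what std::align does to its arguments, here is a minimal sketch of its behaviour. The helper name `align_shift` is hypothetical and only serves the illustration: it reports how many bytes std::align skipped to reach the next suitably aligned address, or -1 when the request does not fit.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper (not part of the answer's allocator): returns the number
// of bytes std::align skipped from `start`, or -1 if `size` bytes aligned to
// `alignment` do not fit into `space` bytes. `alignment` must be a power of two.
std::ptrdiff_t align_shift(std::size_t alignment, std::size_t size,
                           void* start, std::size_t space) {
    void* p = start;                 // std::align modifies its pointer argument
    std::size_t remaining = space;   // ... and its space argument

    if (std::align(alignment, size, p, remaining) == nullptr) {
        return -1;                   // on failure p and remaining are unchanged
    }

    // On success p was advanced to the aligned address and remaining was
    // decreased by the same number of bytes, so both differences agree.
    assert(space - remaining ==
           static_cast<std::size_t>(static_cast<char*>(p) -
                                    static_cast<char*>(start)));

    return static_cast<char*>(p) - static_cast<char*>(start);
}
```

Note that the `space` argument ends up holding the bytes that remain after the adjustment, not the adjustment itself; the distance between the original and the aligned pointer is what a deallocation routine has to undo.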

    So, here we go creating an aligned allocator; to keep it simple I'll focus on the allocation/deallocation routines.

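    #include <algorithm> // std::max
    #include <cstddef>   // size_t, ptrdiff_t
    #include <cstdlib>   // std::malloc, std::free
    #include <memory>    // std::align
    #include <new>       // std::bad_alloc
    
    template <typename T>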
    class aligned_allocator {
        // Allocates block of memory:
        // - (opt) padding
        // - offset: ptrdiff_t
        // - T * n: T
        // - (opt) padding
    public:
        typedef T* pointer;
        typedef size_t size_type;
    
        pointer allocate(size_type n);
        void deallocate(pointer p, size_type n);
    
    }; // class aligned_allocator
    

    And now the allocation routine. Lots of memory fiddling, it's the heart of the allocator after all!

    template <typename T>
    auto aligned_allocator<T>::allocate(size_type n) -> pointer {
        size_type const alignment = std::max(alignof(ptrdiff_t), alignof(T));
        size_type const object_size = sizeof(ptrdiff_t) + sizeof(T)*n;
        size_type const buffer_size = object_size + alignment;
    
        // block is correctly aligned for `ptrdiff_t` because `std::malloc` returns
        // memory correctly aligned for all built-in types.
        void* const block = std::malloc(buffer_size);
    
        if (block == nullptr) { throw std::bad_alloc{}; }
    
        // find the start of the body by suitably aligning memory,
        // note that we reserve sufficient space for the header beforehand
        void* storage = static_cast<char*>(block) + sizeof(ptrdiff_t);
        size_t space = buffer_size - sizeof(ptrdiff_t);
    
        // only the n objects need to fit after the header; `space` exceeds their
        // size by `alignment` bytes, so std::align cannot fail here
        void* const body = std::align(alignment, sizeof(T)*n, storage, space);
    
        // reverse track to find where the offset field starts
        char* const offset = static_cast<char*>(body) - sizeof(ptrdiff_t);
    
        // store the value of the offset (ie, the result of body - block)
        *reinterpret_cast<ptrdiff_t*>(offset) =
            static_cast<char*>(body) - static_cast<char*>(block);
    
        // finally return the start of the body
        return static_cast<pointer>(body);
    } // aligned_allocator<T>::allocate
    

    Fortunately, the deallocation routine is much simpler: it just has to read the offset and apply it.

    template <typename T>
    void aligned_allocator<T>::deallocate(pointer p, size_type) {
        // find the offset field
        char const* header = reinterpret_cast<char const*>(p) - sizeof(ptrdiff_t);
    
        // read its value (the cast keeps the const qualifier of `header`)
        ptrdiff_t const offset = *reinterpret_cast<ptrdiff_t const*>(header);
    
        // apply it to find start of block
        void* const block = reinterpret_cast<char*>(p) - offset;
    
        // finally deallocate
        std::free(block);
    } // aligned_allocator<T>::deallocate
    

    The other routines need not be aware of the memory layout, so writing them is trivial.
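To make that concrete, here is a minimal, self-contained sketch of the same header-plus-offset layout, filled out just enough to plug into std::vector through std::allocator_traits. The names `sketch_aligned_allocator`, `wide_item` and `is_aligned` are hypothetical and exist only for this illustration; the offset is copied with std::memcpy rather than through a cast pointer, which is a choice of this sketch and not something the answer prescribes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <limits>
#include <memory>
#include <new>
#include <vector>

template <typename T>
struct sketch_aligned_allocator {
    typedef T value_type;

    sketch_aligned_allocator() noexcept {}
    template <typename U>
    sketch_aligned_allocator(sketch_aligned_allocator<U> const&) noexcept {}

    T* allocate(std::size_t n) {
        std::size_t const alignment = std::max(alignof(std::ptrdiff_t), alignof(T));
        std::size_t const header = sizeof(std::ptrdiff_t);

        // refuse requests whose byte size would overflow std::size_t
        if (n > (std::numeric_limits<std::size_t>::max() - header - alignment) / sizeof(T)) {
            throw std::bad_alloc();
        }
        std::size_t const body_size = sizeof(T) * n;

        void* const block = std::malloc(header + body_size + alignment);
        if (block == nullptr) { throw std::bad_alloc(); }

        // skip the header, then align what follows
        void* body = static_cast<char*>(block) + header;
        std::size_t space = body_size + alignment;
        if (std::align(alignment, body_size, body, space) == nullptr) {
            std::free(block);
            throw std::bad_alloc();
        }

        // store (body - block) in the bytes immediately before the body
        std::ptrdiff_t const offset = static_cast<char*>(body) - static_cast<char*>(block);
        std::memcpy(static_cast<char*>(body) - header, &offset, sizeof offset);
        return static_cast<T*>(body);
    }

    void deallocate(T* p, std::size_t) noexcept {
        // read the offset back and walk to the address malloc returned
        std::ptrdiff_t offset;
        std::memcpy(&offset, reinterpret_cast<char*>(p) - sizeof offset, sizeof offset);
        std::free(reinterpret_cast<char*>(p) - offset);
    }
};

// The allocator holds no state, so any two instances are interchangeable.
template <typename T, typename U>
bool operator==(sketch_aligned_allocator<T> const&, sketch_aligned_allocator<U> const&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(sketch_aligned_allocator<T> const&, sketch_aligned_allocator<U> const&) noexcept { return false; }

// Hypothetical over-aligned element type and alignment check for experiments.
struct alignas(64) wide_item { char bytes[64]; };

inline bool is_aligned(void const* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

With this in place, something like `std::vector<wide_item, sketch_aligned_allocator<wide_item>> v(5);` should yield a `v.data()` for which `is_aligned(v.data(), 64)` holds, and deallocation works even though the vector only ever sees the aligned pointer.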


    Original answer:

    template <typename T>
    class Block {
    public:
        Block(Block const&) = delete;
        Block& operator=(Block const&) = delete;
    
        explicit Block(size_t n);
        ~Block();
    
    private:
        void* _storage;
        T* _begin;
        T* _end;
    }; // class Block
    
    template <typename T>
    Block<T>::Block(size_t n) {
        size_t const object_size = n * sizeof(T);
        size_t const buffer_size = object_size + alignof(T);
    
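        _storage = std::malloc(buffer_size);
        if (_storage == nullptr) { throw std::bad_alloc{}; }
    
        // _storage keeps the address to free; stock is moved to the aligned start
        void* stock = _storage;
        size_t shift = buffer_size;
        std::align(alignof(T), object_size, stock, shift);
    
        _begin = _end = static_cast<T*>(stock);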
    } // Block<T>::Block
    
    template <typename T>
    Block<T>::~Block() {
        for (; _end != _begin; --_end) {
            (_end - 1)->~T();
        }
    
        std::free(_storage);
    } // Block<T>::~Block
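The idea behind Block, keeping the pointer returned by std::malloc next to the aligned pointer, also works when both size and alignment are only known at run time, which is what the question asks about. The following is a minimal sketch under that assumption; `aligned_block` and its `data()` accessor are hypothetical names, and the alignment is assumed to be a power of two, as std::align requires.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <new>

class aligned_block {
public:
    // `alignment` must be a power of two (a precondition of std::align).
    aligned_block(std::size_t size, std::size_t alignment)
        : _storage(std::malloc(size + alignment)), _begin(_storage) {
        if (_storage == nullptr) { throw std::bad_alloc(); }

        // std::align moves _begin forward; _storage still holds the address
        // that has to be passed to std::free later on.
        std::size_t space = size + alignment;
        if (std::align(alignment, size, _begin, space) == nullptr) {
            std::free(_storage);
            throw std::bad_alloc();
        }
    }

    aligned_block(aligned_block const&) = delete;
    aligned_block& operator=(aligned_block const&) = delete;

    ~aligned_block() { std::free(_storage); }

    // start of the aligned region of `size` bytes
    void* data() const { return _begin; }

private:
    void* _storage; // what std::malloc returned
    void* _begin;   // first suitably aligned byte inside _storage
}; // class aligned_block
```

Compared with the allocator above, this variant needs no header in front of the data, at the price of carrying two pointers in the owning object.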