Search code examples
c++arraysmemory-managementc++17type-punning

type-punning with std::aligned_alloc for array of objects


There is already a lot of posts about strict aliasing rule and type-punning but I couldn't find an explanation that I could understand regarding array of objects. My goal is to have a memory pool non-template class that is used to store arrays of objects. Basically I need to know the actual type only at access time: it can be seen as a non template vector whose iterators would be template. The design I thought of rises several questions so I will try to split them into several SO questions.

My question (which is the third one, see below) is merely how to implement the DefaultAllocate() function in C++17 with std::aligned_alloc (and without std::align_storage which is doomed to deprecation).

#include <cassert>
#include <iostream>
#include <type_traits>

// type that support initialisation from a single double value
using test_t = float;

// just for the sake of the example: p points to at least a sequence of 3 test_t
void load(test_t* p) {
    std::cout << "starting load\n";
    p[0] = static_cast<test_t>(3.14);
    p[1] = static_cast<test_t>(31.4);
    p[2] = static_cast<test_t>(314.);
    std::cout << "ending load\n";
}

// type-punning buffer
// holds a non-typed buffer (actually a char*) that can be used to store any
// types, according to user needs
struct Buffer {
    // buffer address
    char* p = nullptr;
    // number of stored elements
    size_t n = 0;
    // buffer size in bytes
    size_t s = 0;
    // allocates a char buffer large enough for N object of type T and
    // default-construct them
    // calling it on a previously allocated buffer without adequate call to
    // Deallocate is UB
    template <typename T>
    T* DefaultAllocate(const size_t N) {
        size_t RequiredSize =
            sizeof(std::aligned_storage_t<sizeof(T), alignof(T)>) * N;
        n = N;
        T* tmp;
        if (s < RequiredSize) {
            if (p) {
                delete[] p;
            }
            s = RequiredSize;
            std::cout << "Requiring " << RequiredSize << " bytes of storage\n";
            p = new char[s];
            // placement array default construction
            tmp = new (p) T[N];
            // T* tmp = reinterpret_cast<T*>(p);
            // // optional for arithmetic types and also for trivially
            // destructible
            // // types when we don't care about default values
            // for (size_t i = 0; i < n; ++i) {
            //     new (tmp + i) T();
            // }
        } else {
            // placement array default construction
            tmp = new (p) T[N];
            // T* tmp = reinterpret_cast<T*>(p);
            // // optional for arithmetic types and also for trivially
            // destructible
            // // types when we don't care about default values
            // for (size_t i = 0; i < n; ++i) {
            //     new (tmp + i) T();
            // }
        }
        return tmp;
    }
    // deallocate objects in buffer but not the buffer itself
    template <typename T>
    void Deallocate() {
        T* tmp = reinterpret_cast<T*>(p);
        // Delete elements in reverse order of creation
        // optional for default destructible types
        for (size_t i = 0; i < n; ++i) {
            tmp[n - 1 - i].~T();
        }
        n = 0;
    }
    ~Buffer() {
        if (p) {
            delete[] p;
        }
    }
};

int main() {
    constexpr std::size_t N = 3;
    Buffer B;
    test_t* fb = B.DefaultAllocate<test_t>(N);
    load(fb);
    std::cout << fb[0] << '\n';
    std::cout << fb[1] << '\n';
    std::cout << fb[2] << '\n';
    std::cout << alignof(test_t) << '\t' << sizeof(test_t) << '\n';
    B.Deallocate<test_t>();
    return 0;
}

Live
Live more complex

link to question 1
link to question 2

[EDIT] this answer to this question shows that my C++14 snippet above might not be properly aligned: here is a proposed better version inspired from the referenced answer.
see also question 1 for some additional material.


Solution

  • how to implement the DefaultAllocate() function in C++17 with std::aligned_alloc

    • First, I'd make sure that the allocator actually releases memory when it goes out of scope. For that, I'd use a std::function<void()> in which you can store the loop calling the destructors for elements stored in the Buffer. This has the benefit that you don't need to know the type when you later want to deallocate. It's stored in the functor.
    • When checking if we need to allocate new memory we need to ...
      • check if the currently allocated memory is aligned according to alignof(T). If it's not, std::free the old memory and call std::aligned_alloc.
      • check if the the previously allocated memory is large enough. If it's not, use aligned_realloc (see Note below) to acquire more.

    It could look like this:

    // holds a non-typed buffer (actually a void*) that can be used to store any
    // types, according to user needs
    struct Buffer {
        // destructing functor storage
        std::function<void()> destructors = [] {};
        // buffer address
        void* p = nullptr;
        // number of stored elements
        size_t n = 0;
        // buffer size in bytes
        size_t s = 0;
        // allocates a buffer large enough for N objects of type T and
        // default-construct them calling it on a previously allocated buffer
        // without adequate call to Deallocate is **OK**.
        template <typename T>
        T* DefaultAllocate(const size_t N) {
            destructors(); // call destructors since it may already be occupied
            size_t RequiredSize = N * sizeof(T);
            n = N;
            T* tmp;
            if(!p || reinterpret_cast<std::uintptr_t>(p) % alignof(T) != 0) {
                // first allocation or incorrect alignment
                std::free(p);
                p = std::aligned_alloc(alignof(T), RequiredSize);
                if(p == nullptr) return nullptr;
                s = RequiredSize;
            } else if (s < RequiredSize) {
                // not enough room
                auto re = aligned_realloc(p, alignof(T), RequiredSize); // See note
                if(re == nullptr) return nullptr;
                p = re;
                s = RequiredSize;
            }
            // placement array default construction
            tmp = new (p) T[N];
    
            // create destructor functor
            destructors = [this] {
                T* tmp = reinterpret_cast<T*>(p);
                // Delete elements in reverse order of creation
                while (n > 0) {
                    tmp[--n].~T();
                }
            };
            return tmp;
        }
        // deallocate objects in buffer but not the buffer itself
        void Deallocate() { destructors(); }
    
        ~Buffer() {
            destructors();
            std::free(p);
        }
    };
    

    Demo - which uses std::realloc (which has undefined behavior since the memory wasn't allocated with std::malloc, std::calloc or std::realloc). It's a conceptual demo, not a plug-and-play solution.

    Note: aligned_realloc doesn't currently exist in the standard library and implementations differ when it comes to the support for doing aligned reallocation. MSVC has _aligned_realloc which requires _aligned_free and also _aligned_alloc instead of std::aligned_alloc. POSIX has posix_memalign.