c++ serialization reinterpret-cast memory-layout

Is `reinterpret_cast` to go from derived -> memory -> base safe given the proper static assertions?


I'm writing C++14 code which needs to deserialize objects as fast as possible.

The serialization is in memory and occurs in the same process as the deserialization.

Consider this:

struct Base
{
    Base(const int a) :
        a {a}
    {
    }

    int a;
};

struct Derived : public Base
{
    Derived(const int a, const int b, const int c) :
        Base {a},
        b {b},
        c {c}
    {
    }

    int b;
    int c;
};

Assume all those objects will only contain int members.

What I'd like is for some memory region to contain contiguous memory representations of those objects, for example:

[Base] [Base] [Derived] [Base] [Derived] [Derived]

Let's just pretend you can tell whether an object is a Derived or not from its Base::a member.

Is the following code safe given the two static assertions?

#include <iostream>

struct Base
{
    Base(const int a) :
        a {a}
    {
    }

    int a;
};

struct Derived : public Base
{
    Derived(const int a, const int b, const int c) :
        Base {a},
        b {b},
        c {c}
    {
    }

    int b;
    int c;
};

static_assert(alignof(Derived) == alignof(Base),
              "`Base` and `Derived` share the same alignment");

static_assert(sizeof(Derived) == sizeof(Base) + 2 * sizeof(int),
              "Size of `Derived` is expected");

int main()
{
    const Derived d {1, 2, 3};
    const auto dP = reinterpret_cast<const int *>(&d);
    auto& base = reinterpret_cast<const Base&>(*dP);

    std::cout << base.a << std::endl;
}

Then the memory region would simply be an std::vector<int>, for example.
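To make the intended layout concrete, here is a minimal sketch of packing such records into a std::vector<int> and walking them back. The names appendRecord, summarize, Summary and isDerivedTag are hypothetical, and so is the convention that a negative Base::a marks a Derived; neither comes from the code above. It copies with std::memcpy rather than reinterpret_cast, and it assumes that Base::a occupies the first int of every record.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

struct Base
{
    Base(const int a) : a {a} {}
    int a;
};

struct Derived : public Base
{
    Derived(const int a, const int b, const int c) : Base {a}, b {b}, c {c} {}
    int b;
    int c;
};

static_assert(sizeof(Base) == sizeof(int), "`Base` is exactly one int");
static_assert(sizeof(Derived) == 3 * sizeof(int), "`Derived` is exactly three ints");
static_assert(std::is_trivially_copyable<Derived>::value, "memcpy requires this");

// Hypothetical tag convention: a negative `a` marks a Derived record.
inline bool isDerivedTag(const int a) { return a < 0; }

// Appends the object representation of `obj` to `buf` as whole ints.
template <typename T>
void appendRecord(std::vector<int>& buf, const T& obj)
{
    static_assert(sizeof(T) % sizeof(int) == 0, "record must be a whole number of ints");
    const std::size_t oldSize = buf.size();
    buf.resize(oldSize + sizeof(T) / sizeof(int));
    std::memcpy(buf.data() + oldSize, &obj, sizeof(T));
}

struct Summary
{
    std::size_t bases;
    std::size_t deriveds;
    int sumOfC;
};

// Walks the buffer record by record, using the first int of each record as the tag.
Summary summarize(const std::vector<int>& buf)
{
    Summary s {0, 0, 0};
    std::size_t i = 0;
    while (i < buf.size()) {
        if (isDerivedTag(buf[i])) {
            const std::size_t len = sizeof(Derived) / sizeof(int);
            assert(i + len <= buf.size());
            Derived d {0, 0, 0};
            std::memcpy(&d, buf.data() + i, sizeof(Derived));  // copy out, no aliasing
            ++s.deriveds;
            s.sumOfC += d.c;
            i += len;
        } else {
            ++s.bases;
            i += sizeof(Base) / sizeof(int);
        }
    }
    return s;
}
```

Copying each record out costs one small memcpy, which compilers typically reduce to a few loads; the reinterpret_cast version in the question avoids the copy but depends on the layout assumptions discussed here.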

I know I'm relying on compiler-specific behavior here, and those two reinterpret_casts are probably unsafe and not portable, but I'd like to protect my code with good static assertions. Am I missing any?


Solution

  • Derived is not a standard-layout class (non-static data members are declared in both Base and Derived), so its layout does not carry the guarantees you expect. In particular, even if sizeof(Derived) == sizeof(Base) + sizeof(int[2]), the Base subobject could be placed after the members (or in between them), although no compiler actually does this.

    If this assertion passes:

    static_assert(offsetof(Derived, Base::a) == 0, "Base::a not at beginning of Derived");
    

    (This also relies on offsetof(Base, a) == 0, which is guaranteed because Base is a standard-layout class, i.e. std::is_standard_layout<Base>::value is true.)

    Then you know your pointers will hold the correct addresses: (void*) &d.a, (void*) &d and (void*) &(Base&) d will all be equal.

    In C++14, this is enough and your reinterpret_casts will work as-is.


    In C++17, you would have to launder your pointers with std::launder (declared in <new>):

    const auto *dP = std::launder(reinterpret_cast<const int *>(&d));  // Points to d.Base::a
    auto& base = reinterpret_cast<const Base&>(*dP);  // pointer-interconvertible with first member of standard layout class
    
    // Or
    const auto *dP = reinterpret_cast<const int *>(&d);  // Points to d, not d.a, but has the same address
    auto& base = *std::launder(reinterpret_cast<const Base *>(dP));
    
    // Or
    const auto *dP = &d.a;
    auto& base = reinterpret_cast<const Base&>(*dP);
    

    So keep this in mind if you ever upgrade.
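
As a complement to the static assertions, here is a minimal C++14 sketch that checks the address equality described above at run time before applying the casts from the question. The helper names baseIsAtFront and viewAsBase are hypothetical. Plain address comparisons are used instead of offsetof because compilers may warn about offsetof on a non-standard-layout class such as Derived.

```cpp
#include <cassert>
#include <type_traits>

struct Base
{
    Base(const int a) : a {a} {}
    int a;
};

struct Derived : public Base
{
    Derived(const int a, const int b, const int c) : Base {a}, b {b}, c {c} {}
    int b;
    int c;
};

static_assert(std::is_standard_layout<Base>::value,
              "`Base` must be standard-layout so that `a` sits at offset 0 of `Base`");

static_assert(alignof(Derived) == alignof(Base),
              "`Base` and `Derived` must share the same alignment");

static_assert(sizeof(Derived) == sizeof(Base) + 2 * sizeof(int),
              "`Derived` must be `Base` plus two ints, with no padding");

// Returns true when the whole object, its Base subobject and Base::a
// all start at the same address.
inline bool baseIsAtFront(const Derived& d)
{
    const void* const whole = &d;
    const void* const baseSubobject = static_cast<const Base*>(&d);
    const void* const firstMember = &d.a;
    return whole == baseSubobject && baseSubobject == firstMember;
}

// Same two casts as in the question, guarded by the address check.
inline const Base& viewAsBase(const Derived& d)
{
    assert(baseIsAtFront(d));
    const auto dP = reinterpret_cast<const int*>(&d);
    return reinterpret_cast<const Base&>(*dP);
}
```

With const Derived d {1, 2, 3}, viewAsBase(d).a reads 1 on implementations that place the Base subobject first; the assert is only active in builds without NDEBUG.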