c++ language-lawyer undefined-behavior strict-aliasing reinterpret-cast

Can you reinterpret_cast between types which have the same representation?


Suppose we have two types that have the same representation (the same member variables and base classes, in the same order). Is it valid (i.e. not UB) to reinterpret_cast between them? E.g. is it valid to reinterpret_cast an lvalue of type Mary to Ashley&? And what if the two types are polymorphic?

struct Mary {
    int  m1;
    char m2;
};

struct Ashley {
    int  a1;
    char a2;
};

int TryTwins ()
{
    Mary mary = {};

    Ashley& ashley = reinterpret_cast<Ashley&> (mary);
    ashley.a1 = 1;
    ashley.a2 = 2;

    return mary.m1 + mary.m2;
}

What if we cast an object to another type that covers only the beginning of it, knowing that the source type starts with the same member variables as the target type? E.g. is this valid (i.e. not UB)?

struct Locomotive {
    int    engine;
    char   pantograph;
};

struct Train {
    int    engine;
    char   pantograph;
    int*   wagon1;
    int**  wagon2;
    int*** wagon3;
};

int TryTrain ()
{
    Train train = {};

    Locomotive& loc = reinterpret_cast<Locomotive&> (train);
    loc.engine     = 1;
    loc.pantograph = 2;

    return train.engine + train.pantograph;
}

Note that all major compilers treat these as valid casts (live demo). The question is whether the C++ language allows this.


Solution

  • [expr.reinterpret.cast]/11:

    A glvalue expression of type T1 can be cast to the type “reference to T2” if an expression of type “pointer to T1” can be explicitly converted to the type “pointer to T2” using a reinterpret_cast. The result refers to the same object as the source glvalue, but with the specified type. [...]

    Mary and Ashley are object types, so pointers to them can be converted to each other, and the cast itself is allowed. The problem is what follows: we then use an lvalue of type Ashley to access the underlying Mary object.
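    To illustrate the distinction between forming the reference and accessing through it, here is a minimal sketch. The function name CastOnlyKeepsAddress is made up for this illustration, and it assumes Mary and Ashley have the same alignment requirement. It only takes the address of the resulting reference and never reads or writes through it:

```cpp
struct Mary   { int m1; char m2; };
struct Ashley { int a1; char a2; };

// Hypothetical helper: forms the Ashley& but never accesses the stored
// value of the Mary object through it.
bool CastOnlyKeepsAddress()
{
    Mary mary = {};

    // Permitted by [expr.reinterpret.cast]/11: the result refers to the
    // same object as the source glvalue.
    Ashley& ashley = reinterpret_cast<Ashley&>(mary);

    // Taking the address is not an access to the stored value.
    return static_cast<void*>(&ashley) == static_cast<void*>(&mary);
}
```

    Assigning to ashley.a1 or reading ashley.a2, as TryTwins does, is where the rule quoted next comes into play.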

    [basic.lval]/8:

    If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

    • the dynamic type of the object,

    • a cv-qualified version of the dynamic type of the object,

    • a type similar to the dynamic type of the object,

    • a type that is the signed or unsigned type corresponding to the dynamic type of the object,

    • a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,

    • an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),

    • a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,

    • a char, unsigned char, or std::byte type.

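    None of these covers the case in question: Ashley is not the dynamic type of the object (Mary), nor a base class of it, and "similar" only concerns differences in cv-qualification. Therefore, accessing mary through ashley is undefined behavior.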
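    For comparison, here is a hedged sketch of two commonly suggested ways to get a similar effect without accessing an object through an unrelated type. Neither comes from the question: the names TryTwinsMemcpy, TrainWithBase and TryTrainWithBase are invented for this illustration. The first copies the bytes with std::memcpy, assuming both types are trivially copyable and the same size (as the question's premise of identical members implies). The second changes the Train design so that Locomotive is a base class, which matches the "base class type of the dynamic type" bullet above.

```cpp
#include <cstring>
#include <type_traits>

struct Mary   { int m1; char m2; };
struct Ashley { int a1; char a2; };

// Assumptions this sketch relies on, checked at compile time.
static_assert(sizeof(Mary) == sizeof(Ashley), "sizes must match");
static_assert(std::is_trivially_copyable<Mary>::value &&
              std::is_trivially_copyable<Ashley>::value,
              "both types must be trivially copyable");

// Copies the bytes into a real Ashley object, modifies it, and copies back.
int TryTwinsMemcpy()
{
    Mary mary = {};

    Ashley ashley;
    std::memcpy(&ashley, &mary, sizeof ashley);
    ashley.a1 = 1;
    ashley.a2 = 2;
    std::memcpy(&mary, &ashley, sizeof mary);

    return mary.m1 + mary.m2;
}

struct Locomotive {
    int  engine;
    char pantograph;
};

// Hypothetical redesign: the shared prefix becomes a base class.
struct TrainWithBase : Locomotive {
    int*   wagon1;
    int**  wagon2;
    int*** wagon3;
};

int TryTrainWithBase()
{
    TrainWithBase train = {};

    // Implicit derived-to-base conversion; no reinterpret_cast needed.
    Locomotive& loc = train;
    loc.engine     = 1;
    loc.pantograph = 2;

    return train.engine + train.pantograph;
}
```

    With optimization enabled, compilers typically turn the two memcpy calls into plain register moves, so the first variant need not cost more than the cast. The second variant is a design change rather than a drop-in fix, so it only applies when the type definitions can be edited.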