c++ dynamic-cast reinterpret-cast

Why does reinterpret_cast work in some cases but not others?


I just started working with a team that uses reinterpret_cast where it clearly should be dynamic_cast. Although they use reinterpret_cast, the code still seemed to work just fine, so I decided to leave it alone until recently, when it finally stopped working.

struct Base {
 virtual void do_work() = 0;
};
struct D1 : public Base {
 virtual void do_work();
 std::vector<int> i;
};

struct D2 : public D1 {
 void do_work();
};

struct Holds_data {
    std::vector<int> i;
};
struct Use_data : public Holds_data {
 virtual void do_work();
};


struct A : public Use_data, public Base {
    void do_work();
};

//case 1
// this code works
Base* working = new D2();
D2*   d2_inst = reinterpret_cast<D2*>(working);


//case 2
Base* fail = new A();
A*    A_inst  = reinterpret_cast<A*>(fail); // fails
A*    A_inst2 = dynamic_cast<A*>(fail);     // works

In case 1 there does not seem to be a problem; reinterpret_cast seems to work just fine. In case 2 I noticed that the internal data of the std::vector seems to be corrupted when using reinterpret_cast.

My question is why does case 1 pass? Shouldn't there be data corruption within the std::vector?


Solution

  • Short Answer

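    The thing is that Base* working = new D2(); implicitly converts the D2* to a Base*. That conversion has the same effect as a static_cast, and it is allowed to change the address the pointer holds.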

    So if you have:

    D2* d2 = new D2();
    Base* b = d2;
    

    You can't be sure that d2 and b hold the same address, i.e. that static_cast<void*>(d2) == static_cast<void*>(b) is true. But reinterpret_cast only appears to work when they do, because it reuses the address unchanged. So, as mentioned in the comments, the fact that your code runs correctly is just a fortunate coincidence.
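
    The address comparison described above can be sketched as follows. This is only an illustration: the hierarchy is the one from the question with empty function bodies filled in, the virtual destructor on Base is my addition, and base_at_same_address is a hypothetical helper name, not anything from the original code.

    ```cpp
    #include <cassert>  // for the asserts mentioned in the usage note
    #include <vector>

    // Hierarchy from the question; bodies are empty placeholders.
    struct Base {
        virtual void do_work() = 0;
        virtual ~Base() = default;  // assumption: not in the question, added as good practice
    };
    struct D1 : Base {
        void do_work() override {}
        std::vector<int> i;
    };
    struct D2 : D1 {
        void do_work() override {}
    };
    struct Holds_data {
        std::vector<int> i;
    };
    struct Use_data : Holds_data {
        virtual void do_work() {}
    };
    struct A : Use_data, Base {
        void do_work() override {}
    };

    // Hypothetical helper: true when the Base subobject starts at the same
    // address as the complete Derived object.
    template <class Derived>
    bool base_at_same_address(Derived& d) {
        Base* b = &d;  // implicit derived-to-base conversion; may adjust the address
        return static_cast<void*>(b) == static_cast<void*>(&d);
    }
    ```

    With D2 d2; and A a;, the checks assert(base_at_same_address(d2)) and assert(!base_at_same_address(a)) should both pass on the usual GCC, Clang, and MSVC layouts, but neither result is promised by the standard.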


    More Detailed

    The memory layout for class D2 could look like:

    class D2:
    0x0000 Attributes of Base
    ...
    0x0010 Attributes of D1
    ...
    0x0020 Attributes of D2
    ...
    

    Base* b = new D2() stores the address of the Base subobject (0x0000). With single inheritance, common compilers place the base class's members before those of the derived class (the standard does not require this), so the address stored in b (0x0000) is the same as the address returned by new D2() (0x0000), and reinterpret_cast appears to work.

    But on the other hand the memory layout for class A could look like:

    class A:
    0x0000 Attributes of Holds_data
    ...
    0x0010 Attributes of Use_data
    ...
    0x0020 Attributes of Base
    ...
    0x0030 Attributes of A
    ...
    

    Here the compiler has to store the data of either Use_data or Base first. If Use_data is stored first (as in the example), Base* b = new A() still stores the address of the Base subobject (0x0020). But since Base isn't the first class stored in A, the address returned by new A() (0x0000) does not equal the address stored in b (0x0020): the implicit conversion to Base* adjusted the pointer. reinterpret_cast does not undo that adjustment, so it fails here.

    That's why case 1 works and case 2 doesn't.
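
    To make the difference between the casts concrete, here is a small sketch under the same assumptions as the question's hierarchy (empty function bodies, plus a virtual destructor on Base that I added). The helper names reinterpret_keeps_address and proper_casts_recover are made up for this illustration.

    ```cpp
    #include <cassert>  // for the asserts mentioned in the usage note
    #include <vector>

    // Hierarchy from the question; bodies are empty placeholders.
    struct Base {
        virtual void do_work() = 0;
        virtual ~Base() = default;  // assumption: not in the question
    };
    struct D1 : Base {
        void do_work() override {}
        std::vector<int> i;
    };
    struct D2 : D1 {
        void do_work() override {}
    };
    struct Holds_data {
        std::vector<int> i;
    };
    struct Use_data : Holds_data {
        virtual void do_work() {}
    };
    struct A : Use_data, Base {
        void do_work() override {}
    };

    // True when reinterpret_cast leaves the address untouched, i.e. it does
    // not undo the offset added by the derived-to-base conversion.
    // The resulting A* is only compared, never dereferenced.
    bool reinterpret_keeps_address(Base* b) {
        A* wrong = reinterpret_cast<A*>(b);
        return static_cast<void*>(wrong) == static_cast<void*>(b);
    }

    // True when dynamic_cast and static_cast both get back to the original A.
    bool proper_casts_recover(A& a) {
        Base* b = &a;  // points at the Base subobject inside a
        return dynamic_cast<A*>(b) == &a && static_cast<A*>(b) == &a;
    }
    ```

    Given A a;, assert(proper_casts_recover(a)) holds because both casts know the class layout and subtract the offset again. dynamic_cast additionally checks the dynamic type: if the Base* actually points into a D2, dynamic_cast<A*> returns a null pointer, whereas static_cast would silently give you a bad pointer.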


    One last thing: you should never rely on the compiler always using the same memory layout. Many aspects of the memory layout are not defined by the standard. Using reinterpret_cast here is most likely undefined behaviour!