c++ casting multiple-inheritance virtual-functions

XY inherits from both X and Y. Casting XY* to X*, then to Y*, then calling Y's function results in calling X's function


#include <iostream>

struct X
{
    virtual void x() = 0;
};

struct Y
{
    virtual void y() = 0;
};

struct XY : X, Y
{
    void x() override { std::cout << "X\n"; }
    void y() override { std::cout << "Y\n"; }
};

int main()
{
    XY xy;

    X* xptr = &xy;

    Y* yptr = (Y*)xptr;

    yptr->y(); //prints "X"....

    ((Y*)((X*)(&xy)))->y(); // prints "Y"....
}

Output:

X
Y

Can someone explain in some detail why this is happening? Why does the first call print X, and why do the two calls behave differently from each other?


Solution

  • As mentioned in the comments, as far as the language is concerned, this is Undefined Behavior.

    However, the behavior you observed does reveal how the innards of a typical C++ compiler work, so it can still be interesting to investigate why you got the output you did. That said, it's important to remember that the following explanation is not universal. There is no hard requirement for things to work this way, and any code relying on this behavior is effectively broken, even if it works on every compiler you try it on.

    C++ polymorphism is typically implemented using a vtable, which is basically a table of function pointers. Each object of a polymorphic type carries a hidden member that points to the vtable for its class.

    So

    struct X
    {
        virtual void x() = 0;
    };
    
    struct Y {
        virtual void y() = 0;
    };
    

    is roughly equivalent to (it doesn't actually use std::function<>, but this makes the pseudocode more legible):

    struct X {
        struct vtable_t {
          std::function<void(void*)> first_virtual_function;
        };
        
        vtable_t* vtable;
    
        void x() {
          vtable->first_virtual_function(this);
        }
    };
    
    struct Y {
        struct vtable_t {
          std::function<void(void*)> first_virtual_function;
        };
        
        vtable_t* vtable;
    
        void y() {
          vtable->first_virtual_function(this);
        }
    };
    

    Notice how X::vtable_t and Y::vtable_t coincidentally have the same layout. If X and Y declared different numbers of virtual functions, or virtual functions with different signatures, things would not line up this neatly.
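
    The "same layout" point can be made concrete with a hand-rolled model. This is only a sketch of the idea, not what any compiler literally generates: VTable, FakeX, FakeY, impl_x, impl_y and call_slot0 are hypothetical names invented for the illustration, and the functions return their label instead of printing it so the result is easy to check.

    ```cpp
    #include <iostream>
    #include <string>

    // Hypothetical one-slot vtable, mirroring the pseudocode above.
    struct VTable {
        std::string (*slot0)(void* self);
    };

    // Stand-ins for X and Y: each holds nothing but a vtable pointer.
    struct FakeX { const VTable* vtable; };
    struct FakeY { const VTable* vtable; };

    std::string impl_x(void*) { return "X"; }
    std::string impl_y(void*) { return "Y"; }

    const VTable vtable_for_x{&impl_x};
    const VTable vtable_for_y{&impl_y};

    // In this model, both "call x()" and "call y()" reduce to the same
    // operation: look up slot 0 in whatever vtable the object points to.
    std::string call_slot0(const VTable* vt, void* self) {
        return vt->slot0(self);
    }

    int main() {
        FakeX x{&vtable_for_x};
        FakeY y{&vtable_for_y};
        std::cout << call_slot0(y.vtable, &y) << "\n"; // prints "Y"
        std::cout << call_slot0(x.vtable, &x) << "\n"; // prints "X"
    }
    ```

    Because the call site only says "slot 0", which function runs is decided entirely by the vtable the object carries, not by the static type used at the call site.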

    Another important piece of the puzzle is that multiple inheritance is effectively a concatenation:

    struct XY : X, Y {
        void x() override { std::cout << "X\n"; }
        void y() override { std::cout << "Y\n"; }
    };
    
    // is roughly equivalent to:
    struct XY {
      static X::vtable_t vtable_for_x; // with first_virtual_function assigned to XY::x()
      static Y::vtable_t vtable_for_y; // with first_virtual_function assigned to XY::y()
    
      X x_base;
      Y y_base;
    
      XY() {
        x_base.vtable = &vtable_for_x;
        y_base.vtable = &vtable_for_y;
      }
    
      void x() { std::cout << "X\n"; }
      void y() { std::cout << "Y\n"; }
    };
    

    This implies that casting from a multiply-inherited type to one of its bases is not just a matter of changing the type of the pointer; the value may have to change as well.

    Only the X pointer holds the same address as the pointer to the whole object. The Y pointer holds a different address.

    X* xptr = &xy;
    // is equivalent to
    X* xptr = &xy.x_base;
    
    Y* yptr = &xy;
    // is equivalent to
    Y* yptr = &xy.y_base;
    

    Finally, when you cast from X* to Y* with a C-style cast, the two types are unrelated, so the operation is a reinterpret_cast and the pointer's value is left untouched. So while the pointer's type may say Y, the object it points to is still the X subobject.
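
    The difference between the two kinds of conversion can be observed by comparing addresses only, without ever calling through the bad pointer. This is a minimal sketch; the helper names x_address, y_address_converted and y_address_reinterpreted are made up for the example, and the exact offsets depend on the compiler's object layout, so no specific output is claimed.

    ```cpp
    #include <iostream>

    struct X { virtual void x() = 0; };
    struct Y { virtual void y() = 0; };
    struct XY : X, Y {
        void x() override {}
        void y() override {}
    };

    // Address of the X subobject.
    const void* x_address(XY& xy) { return static_cast<X*>(&xy); }

    // Address of the Y subobject, computed by a proper derived-to-base
    // conversion: the compiler adjusts the pointer value if needed.
    const void* y_address_converted(XY& xy) { return static_cast<Y*>(&xy); }

    // Address produced by reinterpreting an X* as a Y*. The pointer is only
    // inspected here, never dereferenced, so this function itself is harmless.
    const void* y_address_reinterpreted(XY& xy) {
        X* xptr = &xy;
        return reinterpret_cast<Y*>(xptr);
    }

    int main() {
        XY xy;
        std::cout << std::boolalpha;
        // The proper conversion lands on a different subobject than X...
        std::cout << (y_address_converted(xy) != x_address(xy)) << "\n";
        // ...while the reinterpreted pointer still points at the X subobject.
        std::cout << (y_address_reinterpreted(xy) == x_address(xy)) << "\n";
    }
    ```

    On mainstream implementations both comparisons are expected to be true, which is exactly why the reinterpreted "Y*" ends up reading X's vtable pointer.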

    Luckily for you, things line up:

    • Both X and Y have the vtable pointer as their first member.
    • The vtables of X and Y have the same layout, the former's only slot pointing to XY::x(), the latter's to XY::y().

    So when the logic of invoking y() is applied to an object of type X, the bits just happen to line up to invoke XY::x() instead.
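
    For completeness, here is a sketch of two well-defined ways to get from an X* to a Y* when the object really is an XY. The helper names call_y_via_dynamic_cast and call_y_via_static_cast are invented for the example, and the virtual functions return their label rather than printing it so the result is easy to check.

    ```cpp
    #include <iostream>
    #include <string>

    struct X { virtual std::string x() = 0; };
    struct Y { virtual std::string y() = 0; };
    struct XY : X, Y {
        std::string x() override { return "X"; }
        std::string y() override { return "Y"; }
    };

    // Cross-cast with a run-time check: dynamic_cast inspects the dynamic
    // type and yields nullptr if the object has no Y base.
    std::string call_y_via_dynamic_cast(X* xptr) {
        Y* yptr = dynamic_cast<Y*>(xptr);
        return yptr ? yptr->y() : "not a Y";
    }

    // Go down to the known derived type first, then let the implicit
    // derived-to-base conversion adjust the pointer.
    // Only valid if *xptr really is part of an XY.
    std::string call_y_via_static_cast(X* xptr) {
        Y* yptr = static_cast<XY*>(xptr);
        return yptr->y();
    }

    int main() {
        XY xy;
        X* xptr = &xy;
        std::cout << call_y_via_dynamic_cast(xptr) << "\n"; // prints "Y"
        std::cout << call_y_via_static_cast(xptr) << "\n";  // prints "Y"
    }
    ```

    Both versions give the compiler the information it needs to adjust the pointer to the Y subobject, which the C-style cast between unrelated types does not.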