c++ multiple-inheritance undefined-behavior virtual-functions reinterpret-cast

Understanding Virtual Functions when Deriving from Multiple Classes


I have started to understand the working of virtual functions in C++ and have come across the following code. Here is my understanding about virtual functions:

  1. Every class that defines a virtual function has a vtable created for it.
  2. When an instance of a class is created, a vptr is created that points to the vtable of that class.

Based on this understanding, I am trying to analyse the output of the following code, but I cannot work out how it prints "C12g".

#include <iostream>
using namespace std;

class I1 {
  public: 
    virtual void f(){cout << "I1" << endl;}
};
class I2 {
  public: 
    virtual void g(){cout << "I2" << endl;}
};
class C12 : public I1, public I2 {
  public:
    virtual void f(){cout << "C12f" << endl;}
    virtual void g(){cout << "C12g" << endl;}
};
int main(int argc, char *argv[]) {
  I2 *o = new C12();
  ((I1*)o)->f();
}

I thought that, since the C12 object is assigned to type I2, object o can access only its method g() in C12 (since g is overridden). Now, since o is type-cast to an I1, I thought f() in C12 will get called.

Actual output: C12g

I would like to know about the following things:

  1. Memory layout of C12 and what I1, I2 points to
  2. How C12g gets printed as output.
  3. What happens when an object is type-cast between two unrelated interfaces?

Solution

  • What you must first understand here is that the actual object created, *o is of type C12 - because that's what you constructed with new C12().

    Next, with virtual functions, the override belonging to the actual object is called, no matter what pointer type you access it through. So, when you convert to an I2 pointer in I2 *o = new C12(), it makes no difference to the underlying object: if you then call o->g(), the object 'knows' to call its overriding function.

    However, when you then cast the pointer to the 'unrelated' I1*, you are on shaky ground. Bearing in mind that the classes I1 and I2 have, essentially, identical memory layouts, calling f() through one is directed to the same v-table 'offset' as calling g() through the other. But, as o still points to the I2 part of the object, the v-table entry the call ends up with is the slot for g in I2 - which is overridden by C12, hence "C12g".

    It's also noteworthy that you've used a C-style cast to get from I2* to I1* (but you could also use a reinterpret_cast). This is important because, between unrelated class types such as these, both do absolutely nothing to the pointer value, or to the object/memory pointed to.

    Probably sounds a bit garbled, but I hope it offers some insight!

    Here's a possible memory layout/scenario - but it's going to be implementation-specific, and calling a member function through the pointer after the C-style cast is undefined behaviour, so none of this is guaranteed!

    Possible memory map (simplified, assuming 4-bytes for all components):

    class I1:
    0x0000: (non-virtual data for class I1)
    0x0004: v-table entry for function "f"
    
    class I2:
    0x0000: (non-virtual data for class I2)
    0x0004: v-table entry for function "g"
    
    class C12:
    0x0000: (non-virtual data for class I1)
    0x0004: v-table entry for function "f"
    0x0008: (non-virtual data for class I2)
    0x000C: v-table entry for function "g"
    0x0010: (class-specific stuff for C12)
    

    Now, when you do the conversion from a C12* to I2* in I2 *o = new C12();, the compiler understands the relation between the two classes, so o will point to the 0x0008 offset in C12 (the derived class has been correctly 'sliced'). But the C-style cast from I2* to I1* doesn't change anything, so the compiler 'thinks' it points to an I1 but it's still pointing to an actual I2 slice of C12 - and this 'looks' just like a real I1 class.

    Homework Assignment

    What you may find interesting (and may or may not concur with the memory layout I've described) is to add the following code towards the end of main():

    C12* properC12 = new C12();   // Points to the 'origin' of the class
    I1* properI1 = properC12;     // Should (?) have same value as above?
    I2* properI2 = properC12;     // Should (?) have an offset to 'slice'
    I1* dodgyI1 = (I1*)properC12; // Valid upcast from C12*, so same value as properI1
    cout << std::hex << properC12 << endl;
    cout << std::hex << properI1 << endl;
    cout << std::hex << properI2 << endl;
    cout << std::hex << dodgyI1 << endl;
    

    Please - anyone who tries - let us know what the values are, and what platform/compiler you are using. In Visual Studio 2019, compiling for the x64 platform, I get these pointer values:

    000002688A9726E0
    000002688A9726E0
    000002688A9726E8
    000002688A9726E0
    

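    ... which (sort of) concurs with the memory layout I described (other than having the v-tables somewhere else, rather than 'in-block'). Note that the fourth value equals the first two: a C-style cast from C12* to I1* is an ordinary upcast that the compiler adjusts correctly. Only a cast from properI2, as in the question, would keep the offset address.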