What happens internally when we do downcasting?

I was trying to understand down-casting... Here is what I have tried...

#include <cstdlib>   // EXIT_SUCCESS
#include <iostream>
using namespace std;

class Shape
{
public:
    Shape() {}
    virtual ~Shape() {}
    virtual void draw(void)     { cout << "Shape: Draw Method" << endl; }
};

class Circle : public Shape
{
public:
    Circle(){}
    ~Circle(){}
    void draw(void)     { cout << "Circle: Draw Method" << endl; }
    void display(void)  { cout << "Circle: Only CIRCLE has this" << endl; }
};

int main(void)
{
    Shape newShape;
    Circle *ptrCircle1 = (Circle *)&newShape;
    ptrCircle1->draw();
    ptrCircle1->display();

    return EXIT_SUCCESS;
}

Here I have done the down-casting by assigning the address of a base class object to a derived class pointer. What I understood is...

Circle* ptrCircle1 -->  +------+ new Shape()
                        |draw()|
                        +------+

The base class has no information about the display() method, which exists only in the derived class. I was expecting a crash, but it printed this output:

Shape: Draw Method
Circle: Only CIRCLE has this

Can someone explain what happens internally?

Thanks...

Solution

The C-style cast, in this case and due to the inheritance relationship, is equivalent to a static_cast. As with most casts (the exception being dynamic_cast, where some runtime checks are injected), when you tell the compiler that the object is really a Circle, it will trust you and assume that it is. The behavior is undefined in this case: the object is not a Circle, you are lying to the compiler, and all bets are off.
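For contrast, here is a minimal sketch of the check that dynamic_cast injects. The types are trimmed-down stand-ins for the ones in the question, and isReallyCircle is a hypothetical helper name introduced only for illustration.

```cpp
#include <cassert>

// Trimmed-down stand-ins for the question's classes (illustrative only).
struct Shape {
    virtual ~Shape() {}
    virtual const char* name() const { return "Shape"; }
};

struct Circle : Shape {
    const char* name() const override { return "Circle"; }
};

// Hypothetical helper: true only when the object behind `s` really is a Circle.
// dynamic_cast consults runtime type information and yields a null pointer on
// a mismatch, which is the check that static_cast and the C-style cast skip.
bool isReallyCircle(Shape* s) {
    return dynamic_cast<Circle*>(s) != nullptr;
}
```

Called with the address of a plain Shape this returns false, so the caller can refuse to use the pointer as a Circle instead of running into undefined behavior.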

What really happens here is that the compiler figures out whether there is an offset from the base to the derived type for this combination and adjusts the pointer accordingly. At this point you have a pointer to the derived type holding the adjusted address, and type safety is out the window. Any access through that pointer will assume that the memory at that location is what you said it was and will interpret it as such. That is undefined behavior, because you are reading memory as if it were of a type that it is not.

When is the pointer adjusted?

struct base1 { int x; };
struct base2 { int y; };
struct derived : base1, base2 {};
base2 *p = new derived;

The addresses of the derived object, its base1 subobject and base1::x are the same, but they differ from the address of the base2 subobject and base2::y. When casting from derived to base2, the compiler adjusts the pointer in the conversion (adding sizeof(base1) to the address in this layout); when casting from base2 to derived, it adjusts in the opposite direction.
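The adjustment can be observed directly. This is a small sketch using the same three structs; base2Offset and roundTrips are hypothetical helper names, and the exact offset is an ABI detail rather than something the standard promises.

```cpp
#include <cassert>
#include <cstddef>

struct base1 { int x; };
struct base2 { int y; };
struct derived : base1, base2 {};

// Hypothetical helper: byte distance from the start of the derived object
// to its base2 subobject, as seen after the derived-to-base conversion.
std::ptrdiff_t base2Offset(derived* d) {
    base2* b = d;  // implicit conversion; the compiler adjusts the address here
    return reinterpret_cast<char*>(b) - reinterpret_cast<char*>(d);
}

// Hypothetical helper: derived -> base2 -> derived must give back the
// original address, because static_cast undoes the adjustment.
bool roundTrips(derived* d) {
    base2* b = d;
    return static_cast<derived*>(b) == d;
}
```

On the common ABIs base2Offset is non-zero (typically sizeof(base1)), while roundTrips holds on any conforming implementation because the pointer really does refer to a derived object.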

Why do you get the results you get?

Shape: Draw Method

Circle: Only CIRCLE has this

This is related to how dynamic dispatch is implemented by the compiler. For each type with at least one virtual function the compiler will generate one (or more) virtual tables. The virtual table contains pointers to the final overrider for each function in the type. Every object holds a pointer(s) to the virtual table(s) for the complete type. Calling a virtual function involves the compiler doing a lookup in the table and following the pointer.
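As a rough model of that mechanism, the sketch below builds the table and the vptr by hand. Every name in it (VTable, ShapeModel, CircleModel, callDraw, callDisplay) is made up for illustration, and real compilers generate the equivalent machinery themselves with ABI-specific layouts.

```cpp
#include <cassert>
#include <string>

// Hand-written model of a virtual table: one slot per virtual function.
struct VTable {
    std::string (*draw)(const void* self);
};

std::string shapeDraw(const void*)  { return "Shape: Draw Method"; }
std::string circleDraw(const void*) { return "Circle: Draw Method"; }

const VTable shapeVTable  = { &shapeDraw };
const VTable circleVTable = { &circleDraw };

// Every "object" stores a pointer to the table of its complete type.
struct ShapeModel {
    const VTable* vptr;
    ShapeModel() : vptr(&shapeVTable) {}
};

struct CircleModel {
    ShapeModel base;  // base subobject at offset 0, sharing the vptr slot
    CircleModel() { base.vptr = &circleVTable; }  // derived constructor repoints it
};

// A "virtual call": load the vptr, pick the slot, call through the pointer.
std::string callDraw(const ShapeModel* obj) {
    return obj->vptr->draw(obj);
}

// A "non-virtual call": the target is fixed at compile time, no vptr lookup.
std::string callDisplay(const CircleModel*) {
    return "Circle: Only CIRCLE has this";
}
```

In this model callDraw on a ShapeModel always reaches shapeDraw, whatever pointer type the caller believes it holds, because only the stored vptr decides the target. callDisplay never looks at the object at all.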

In this case the object is really a Shape, so its vptr refers to the virtual table for Shape. When you cast from Shape to Circle you tell the compiler that this is a Circle (even though it is not). When you call draw(), the compiler follows the vptr; the vptr for the Shape subobject and the one for the Circle object happen to be at the same offset (0 in most ABIs) from the beginning of the object. The call injected by the compiler therefore follows the Shape vptr (the cast does not change any contents of the memory, so that vptr is still that of Shape) and hits Shape::draw.

In the case of display(), the call is not dynamically dispatched through the vptr, as it is not a virtual function. That means the compiler will inject a direct call to Circle::display(), passing the address that you have as the this pointer. Because display() never touches any member data, the call appears to work. You can simulate this for a virtual function by disabling dynamic dispatch with a qualified call:

ptrCircle1->Circle::draw();

Remember that this is just an explanation of compiler details that fall outside the C++ standard. As far as the standard is concerned this is simply undefined behavior, and whatever the compiler does is fine. A different compiler could do something different (although all ABIs I have seen do basically the same here).

If you are really interested in the details of how these things work, you can take a look at Inside the C++ Object Model by Lippman. It is a somewhat old book, but it addresses the problems the compiler must solve and some of the solutions that compilers have used.