Upcast/Downcast and serialization

Just playing around with casting. Assume we have two classes:

public class Base
{
    public int a;
}

public class Inh : Base
{
    public int b;
}

Instantiate both of them:

        Base b1 = new Base {a = 1};
        Inh i1 = new Inh {a = 2, b = 2};

Now, let's try an upcast:

        // Upcast
        Base b2 = i1;

It seems that b2 is still holding field b, which is present only in the Inh class. Let's check it by downcasting:

        // Downcast
        var b3 = b2;
        var i2 = b2 as Inh;
        var i3 = b3 as Inh;

        bool check = (i2 == i3);

check is true here (I guess because i2 and i3 are referencing the same instance, i1). OK, let's see how they would be stored in an array:

        var list = new List<Base>();

        list.Add(new Base {a = 5});
        list.Add(new Inh {a = 10, b = 5});

        int sum = 0;
        foreach (var item in list)
        {
            sum += item.a;
        }

Everything is okay, as sum is 15. But when I try to serialize the array with XmlSerializer (just to see what's inside), it throws an InvalidOperationException: "The type ConsoleApplication1.Inh was not expected". Well, fair enough, because it's an array of Bases.

So, what actually is b2? Can I serialize an array of Bases and Inhs? Can I get the Inh fields by downcasting items from the deserialized array?

Solution

Actually, the question is about what happens in memory

So; not serialization, then. K.

Let's take it from the top, then:

public class Base
{
    public int a;
}

public class Inh : Base
{
    public int b;
}

Here we have two reference types (classes); the fact that they are reference-type is very important, because that directly influences what is actually stored in arrays / variables.

Base b1 = new Base {a = 1};
Inh i1 = new Inh {a = 2, b = 2};

Here we create 2 objects; one of type Base, and one of type Inh. The reference to each object is stored in b1 / i1 respectively. I'm stressing the word "reference" for a reason: it is not the object that is there. The object is somewhere arbitrary on the managed heap. Essentially b1 and i1 are just holding the memory address of the actual object. Side note: there are minor technical differences between "reference", "address" and "pointer", but they serve the same purpose here.

Base b2 = i1;

This copies the reference, and assigns that reference to b2. Note that we haven't copied the object. We still only have 2 objects. All we have copied is the number that happens to represent a memory address.
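To make the "copied reference, not copied object" point concrete, here is a minimal sketch. The class name ReferenceCopyDemo is just an illustrative choice; Base and Inh are the classes from the question.

```csharp
using System;

public class Base { public int a; }
public class Inh : Base { public int b; }

public static class ReferenceCopyDemo
{
    public static void Main()
    {
        Inh i1 = new Inh { a = 2, b = 2 };
        Base b2 = i1; // copies the reference, not the object

        // Both variables refer to the same object on the managed heap.
        Console.WriteLine(object.ReferenceEquals(b2, i1)); // True

        // A change made through one variable is visible through the other,
        // because there is only one object.
        b2.a = 42;
        Console.WriteLine(i1.a); // 42
    }
}
```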

var b3 = b2;
var i2 = b2 as Inh;
var i3 = b3 as Inh;
bool check = (i2 == i3);

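Here we do the same thing in reverse: b3 is yet another copy of the same reference, and each "as Inh" type-tests the object and hands back that same reference, now typed as Inh. So i2 and i3 both refer to the same object as i1, which is why check is true (with no overloaded == operator, == on classes compares references).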

var list = new List<Base>();

list.Add(new Base {a = 5});
list.Add(new Inh {a = 10, b = 5});

int sum = 0;
foreach (var item in list)
{
    sum += item.a;
}

The list here is a list of references. The objects are still somewhere arbitrary on the managed heap. So yes, we can iterate through them. Because all Inh are also Base, there is no issue whatsoever here. So finally, we get to the question (from comments):

Then, another question (more detailed): how Inh would be stored in array of Bases? Would b be just dropped?

Absolutely not. Because they are reference-types, the list never actually contains any Inh or Base objects - it only contains the reference. The reference is just a number - 120934813940 for example. A memory address, basically. It doesn't matter at all whether we think 120934813940 points to a Base or an Inh - our talking about it in either of those terms doesn't impact the actual object located at 120934813940. All we need to do is perform a cast, which means: instead of thinking of 120934813940 as a Base, think of it as an Inh - which involves a type-test to confirm that it is what we suspect. For example:
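If you want to see for yourself that the object keeps its real type while sitting in a List<Base>, something like the following should do. The names ListOfReferencesDemo and original are made up for this sketch.

```csharp
using System;
using System.Collections.Generic;

public class Base { public int a; }
public class Inh : Base { public int b; }

public static class ListOfReferencesDemo
{
    public static void Main()
    {
        Inh original = new Inh { a = 10, b = 5 };
        var list = new List<Base> { new Base { a = 5 }, original };

        foreach (Base item in list)
        {
            // GetType() reports the type of the object, not of the variable.
            Console.WriteLine(item.GetType().Name); // Base, then Inh
        }

        // The list holds the very same reference we put in.
        Console.WriteLine(object.ReferenceEquals(list[1], original)); // True
    }
}
```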

int sum = 0;
foreach (var item in list)
{
    sum += item.a;
    if(item is Inh)
    {
       Inh inh = (Inh)item;
       Console.WriteLine(inh.b);
    }
}

So b was there all the time! The only reason we couldn't see it is that we only assumed that item was a Base. To get access to b we need to cast the value. There are three important operations commonly used here:

obj is Foo - performs a type test returning true if the value is non-null and is trivially assignable as that type, else false
obj as Foo - performs a type test, returning the reference typed as Foo if it is non-null and is a match, or null otherwise
(Foo)obj - performs a type test, returning null if it is null, the reference typed as Foo if it is a match, or throws an exception otherwise
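A small sketch of the three operations side by side, using the classes from the question. The variable names plain, derived and nothing are only for illustration.

```csharp
using System;

public class Base { public int a; }
public class Inh : Base { public int b; }

public static class CastOperatorsDemo
{
    public static void Main()
    {
        Base plain = new Base { a = 1 };
        Base derived = new Inh { a = 2, b = 3 };
        Base nothing = null;

        // is: true only for a non-null object of a compatible type
        Console.WriteLine(plain is Inh);   // False
        Console.WriteLine(derived is Inh); // True
        Console.WriteLine(nothing is Inh); // False

        // as: the typed reference on a match, null otherwise
        Console.WriteLine((plain as Inh) == null); // True
        Console.WriteLine((derived as Inh).b);     // 3

        // cast: the typed reference on a match, null stays null,
        // and a mismatch throws
        Console.WriteLine(((Inh)derived).b);     // 3
        Console.WriteLine((Inh)nothing == null); // True
        try
        {
            Inh bad = (Inh)plain; // plain is really a Base, so this throws
            Console.WriteLine(bad.b);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException");
        }
    }
}
```

As a rule of thumb: use the cast when a mismatch would be a bug you want to hear about, and is / as when a mismatch is an expected case.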

So that loop could also be written as:

int sum = 0;
foreach (var item in list)
{
    sum += item.a;
    Inh inh = item as Inh;
    if(inh != null)
    {
       Console.WriteLine(inh.b);
    }
}
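If you are on C# 7 or later (an assumption; the question doesn't say which compiler is in use), the type test and the typed variable can be combined with a type pattern, and LINQ's OfType<T>() can filter the list for you. A sketch, with PatternMatchingDemo and onlyInh as illustrative names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Base { public int a; }
public class Inh : Base { public int b; }

public static class PatternMatchingDemo
{
    public static void Main()
    {
        var list = new List<Base>();
        list.Add(new Base { a = 5 });
        list.Add(new Inh { a = 10, b = 5 });

        int sum = 0;
        foreach (Base item in list)
        {
            sum += item.a;
            if (item is Inh inh) // type test and typed variable in one step
            {
                Console.WriteLine(inh.b); // 5
            }
        }
        Console.WriteLine(sum); // 15

        // OfType<Inh>() yields only the items that really are Inh.
        foreach (Inh onlyInh in list.OfType<Inh>())
        {
            Console.WriteLine(onlyInh.b); // 5
        }
    }
}
```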