Tags: c#, .net, casting, type-conversion, unboxing

Why are casting and conversion operations syntactically indistinguishable?


Stack Overflow has several questions about casting boxed values: 1, 2.

The solution is to first unbox the value and only then cast it to another type. Nevertheless, a boxed value "knows" its own type, and I see no reason why the conversion operator could not be called.
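To make the boxed-value case concrete, here is a minimal sketch, assuming a boxed int that we want as a long. The class and method names (UnboxingDemo, DirectCastThrows, UnboxThenConvert) are hypothetical and exist only for illustration.

```csharp
using System;

public static class UnboxingDemo
{
    // Returns true if casting the boxed value straight to long throws.
    public static bool DirectCastThrows(object boxed)
    {
        try
        {
            // Static type is object, so this is an unboxing cast:
            // it succeeds only if the boxed value really is a long.
            long unused = (long)boxed;
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // Unbox to the actual type first, then apply the numeric conversion.
    public static long UnboxThenConvert(object boxed)
    {
        return (long)(int)boxed;
    }

    public static void Main()
    {
        object boxed = 42;                            // boxes a System.Int32
        Console.WriteLine(DirectCastThrows(boxed));   // True
        Console.WriteLine(UnboxThenConvert(boxed));   // 42
    }
}
```

The inner (int) is the unboxing step; the outer (long) is an int-to-long conversion that the compiler can select only because it now knows the operand's static type.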

Moreover, the same issue applies to reference types:

void Main()
{
    object obj = new A();
    B b = (B)obj;   // compiles to castclass B; throws InvalidCastException at run time
}


public class A 
{
}

public class B {}

This code throws InvalidCastException. So it's not a matter of value versus reference types; it's how the compiler behaves.

For the code above it emits castclass B, and for this code

void Main()
{
    A obj = new A();
    B b = (B)obj;   // compiles to call A.op_Explicit
}

public class A 
{
    public static explicit operator B(A obj)
    {
        return new B();
    }
}

public class B
{
}

it emits call A.op_Explicit.

Aha! Here the compiler sees that A has an operator and calls it. But what happens then if B inherits from A? Not so fast, the compiler is quite clever... it just says:

'A.explicit operator B(A)': user-defined conversions to or from a derived class are not allowed

Ha! No ambiguity!
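The two snippets above can be combined into one sketch showing that the same cast syntax, applied to the same object, is bound differently depending on the operand's compile-time type. StaticTypeDemo and its method names are hypothetical; A and B are the unrelated classes from the second snippet.

```csharp
using System;

public class A
{
    // User-defined explicit conversion, as in the second snippet.
    public static explicit operator B(A obj)
    {
        return new B();
    }
}

public class B
{
}

public static class StaticTypeDemo
{
    // Static type is A: the compiler binds (B) to A.op_Explicit.
    public static B ConvertViaOperator(A a)
    {
        return (B)a;
    }

    // Static type is object: the compiler emits castclass B and
    // never considers the operator declared on A.
    public static bool CastViaObjectThrows(object obj)
    {
        try
        {
            B unused = (B)obj;
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        A a = new A();
        Console.WriteLine(ConvertViaOperator(a) != null);   // True
        Console.WriteLine(CastViaObjectThrows(a));          // True
    }
}
```

Both methods contain the text (B) followed by an expression, yet one is a method call that builds a new object and the other is a run-time type check on the existing one.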

But why on Earth did they allow two rather different operations to look the same?! What was the reason?


Solution

  • Your observation is, as far as I can tell, the observation that I made here:

    http://ericlippert.com/2009/03/03/representation-and-identity/

    There are two basic usages of the cast operator in C#:

    (1) My code has an expression of type B, but I happen to have more information than the compiler does. I claim to know for certain that at runtime, this object of type B will actually always be of derived type D. I will inform the compiler of this claim by inserting a cast to D on the expression. Since the compiler probably cannot verify my claim, the compiler might ensure its veracity by inserting a run-time check at the point where I make the claim. If my claim turns out to be inaccurate, the CLR will throw an exception.

    (2) I have an expression of some type T which I know for certain is not of type U. However, I have a well-known way of associating some or all values of T with an “equivalent” value of U. I will instruct the compiler to generate code that implements this operation by inserting a cast to U. (And if at runtime there turns out to be no equivalent value of U for the particular T I’ve got, again we throw an exception.)

    The attentive reader will have noticed that these are opposites. A neat trick, to have an operator which means two contradictory things, don’t you think?
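    The two usages can be illustrated with built-in types only. This is a minimal sketch, not taken from the linked article; TwoMeaningsDemo and its method names are hypothetical.

    ```csharp
    using System;

    public static class TwoMeaningsDemo
    {
        // Usage (1): the cast asserts a more specific type for the same object.
        // The reference is unchanged; only a run-time check is performed.
        public static bool DowncastPreservesIdentity()
        {
            object o = "hello";
            string s = (string)o;
            return ReferenceEquals(o, s);
        }

        // Usage (2): the cast produces a different value of a different type.
        public static int ConvertDouble(double d)
        {
            return (int)d;   // truncates toward zero
        }

        // Usage (2) with no equivalent value: in a checked context the
        // conversion throws instead of silently wrapping around.
        public static bool NarrowingThrows(int value)
        {
            try
            {
                byte unused = checked((byte)value);
                return false;
            }
            catch (OverflowException)
            {
                return true;
            }
        }

        public static void Main()
        {
            Console.WriteLine(DowncastPreservesIdentity());   // True
            Console.WriteLine(ConvertDouble(3.7));            // 3
            Console.WriteLine(NarrowingThrows(300));          // True
        }
    }
    ```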

    So apparently you are one of the "attentive readers" I called out who have noticed that we have one operation that logically means two rather different things. This is a good observation!

    Your question is "why is that the case?" This is not a good question! :-)

    As I have noted many times on this site, I cannot answer "why" questions satisfactorily. "Because that's what the specification says" is a correct answer but unsatisfactory. Really what the questioner is usually looking for is a summary of the language design process.

    When the C# language design team designs features, the debates can go on for literally months. They can involve a dozen people discussing many different proposals, each with its own pros and cons, and they generate hundreds of pages of notes. Even if I had the relevant information from the late 1990s meetings about cast operations, which I don't, it seems hard to summarize it concisely in a manner that would be satisfactory to the original questioner.

    Moreover, in order to satisfactorily answer this question one would of course have to discuss the entire historical perspective. C# was designed to be immediately productive for existing C, C++ and Java programmers, and so it borrows many of the conventions of these languages, including its basic mechanisms for conversion operators. In order to properly answer the question we would have to discuss the history of the cast operator in C, C++ and Java as well. This seems like far too much information to expect in an answer on Stack Overflow.

    Frankly, the most likely explanation is that this decision was not the result of long debate between the merits of different positions. Rather, it's likely the language design team considered how it is done in C, C++ and Java, made a reasonable compromise position that didn't look too terrible, and moved on to other more interesting business. A proper answer would therefore be almost entirely historical; why did Ritchie design the cast operator like he did for C? I don't know, and we can't ask him.

    My advice to you is that you stop asking "why?" questions about the history of programming language design and instead ask specific technical questions about specific code, questions that have short answers.