Tags: c#, .net, clr, cil, unboxing

Why does 'unbox.any' not provide a helpful exception text the way 'castclass' does?


To illustrate my question, consider these trivial examples (C#):

object reference = new StringBuilder();
object box = 42;
object unset = null;

// CASE ONE: bad reference conversions (CIL instruction 0x74 'castclass')
try
{
  string s = (string)reference;
}
catch (InvalidCastException ice)
{
  Console.WriteLine(ice.Message); // Unable to cast object of type 'System.Text.StringBuilder' to type 'System.String'.
}
try
{
  string s = (string)box;
}
catch (InvalidCastException ice)
{
  Console.WriteLine(ice.Message); // Unable to cast object of type 'System.Int32' to type 'System.String'.
}

// CASE TWO: bad unboxing conversions (CIL instruction 0xA5 'unbox.any')
try
{
  long l = (long)reference;
}
catch (InvalidCastException ice)
{
  Console.WriteLine(ice.Message); // Specified cast is not valid.
}
try
{
  long l = (long)box;
}
catch (InvalidCastException ice)
{
  Console.WriteLine(ice.Message); // Specified cast is not valid.
}
try
{
  long l = (long)unset;
}
catch (NullReferenceException nre)
{
  Console.WriteLine(nre.Message); // Object reference not set to an instance of an object.
}

So in the cases where we attempt a reference conversion (corresponding to CIL instruction castclass), the exception thrown contains an excellent message of the form:

Unable to cast object of type 'X' to type 'Y'.

Experience shows that this message is often extremely helpful to the developer, experienced or not, who has to track down the bug.

In contrast, the message we get when an attempted unboxing (unbox.any) fails is rather uninformative. Is there any technical reason why this must be so?

Specified cast is not valid. [NOT HELPFUL]

In other words, why do we not receive a message like (my words):

Unable to unbox an object of type 'X' into a value of type 'Y'; the two types must agree.

respectively (my words again):

Unable to unbox a null reference into a value of the non-nullable type 'Y'.

So to repeat my question: Is it "accidental" that the error message in one case is good and informative, and in the other case is poor? Or is there a technical reason why it would be impossible, or prohibitively difficult, for the runtime to provide details of the actual types encountered in the second case?

(I have seen a couple of threads here on SO that I am sure would never have been asked if the exception text for failed unboxings had been better.)


Update: Daniel Frederico Lins Leite's answer led to him opening an issue on the CLR GitHub repository (see below). It turned out to be a duplicate of an earlier issue (raised by Jon Skeet; people almost guessed it!). So there was no good reason for the poor exception message, and it had already been fixed in the CLR. I was not the first to wonder about this. We can look forward to the day when this improvement ships in the .NET Framework.


Solution

  • TL;DR

    I think the runtime has all the information needed to improve the message. Maybe a JIT developer could weigh in: the JIT code is very sensitive, and decisions are sometimes made for performance or security reasons that are hard for an outsider to understand.

    Detailed Explanation

    To simplify the problem I changed the method to:

    C#

    void StringBuilderCast()
    {
        object sbuilder = new StringBuilder();
        string s = (string)sbuilder;
    }
    

    IL

    .method private hidebysig 
        instance void StringBuilderCast() cil managed 
    {
        // Method begins at RVA 0x214c
        // Code size 15 (0xf)
        .maxstack 1
        .locals init (
            [0] object sbuilder,
            [1] string s
        )
    
        IL_0000: nop
        IL_0001: newobj instance void [mscorlib]System.Text.StringBuilder::.ctor()
        IL_0006: stloc.0
        IL_0007: ldloc.0
        IL_0008: castclass [mscorlib]System.String
        IL_000d: stloc.1
        IL_000e: ret
    } // end of method Program::StringBuilderCast
    

    The important opcodes here are:

    http://msdn.microsoft.com/library/system.reflection.emit.opcodes.newobj.aspx
    http://msdn.microsoft.com/library/system.reflection.emit.opcodes.castclass.aspx

    And the general memory layout is:

    Thread Stack                        Heap
    +---------------+          +---+---+----------+
    | some variable |    +---->| L | T |   DATA   |
    +---------------+    |     +---+---+----------+
    |   sbuilder    |----+
    +---------------+
    
    T = Instance Type  
    L = Instance Lock  
    DATA = Instance Data
    

    So in this case the runtime knows that it has a pointer to a StringBuilder and that it should cast it to a string. It has all the information needed to give you the best possible exception.
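
    To make the layout above concrete, here is a toy C++ model of it. Every name in it (ToyTypeInfo, ToyObject, DynamicTypeName) is made up for illustration and is not a CLR type or function; it only shows that the dynamic type can be reached from nothing but the object reference.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the runtime's per-type metadata; not a real CLR type.
struct ToyTypeInfo {
    std::string name;
};

// Toy heap object laid out like the diagram: header first, instance data after.
struct ToyObject {
    int lockWord;             // "L": instance lock
    const ToyTypeInfo* type;  // "T": instance type
    long long data;           // "DATA": instance data (a single field here)
};

// Given nothing but a reference to the object, the dynamic type is one
// pointer hop away, so it is always available when a cast fails.
inline std::string DynamicTypeName(const ToyObject* obj) {
    assert(obj != nullptr && obj->type != nullptr);
    return obj->type->name;
}
```

    In this model, a ToyObject whose type pointer refers to a ToyTypeInfo named "System.Text.StringBuilder" reports that name, regardless of the static type of the variable that holds the reference.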

    If we look at the runtime source at https://github.com/dotnet/coreclr/blob/32f0f9721afb584b4a14d69135bea7ddc129f755/src/vm/interpreter.cpp#L6137 we will see something like this:

    CORINFO_CLASS_HANDLE cls = GetTypeFromToken(m_ILCodePtr + 1, CORINFO_TOKENKIND_Casting  InterpTracingArg(RTK_CastClass));
    Object * pObj = OpStackGet<Object*>(idx);
    ObjIsInstanceOf(pObj, TypeHandle(cls), TRUE); // ObjIsInstanceOf will throw if the cast can't be done
    

    If we dig into this method:

    https://github.com/dotnet/coreclr/blob/32f0f9721afb584b4a14d69135bea7ddc129f755/src/vm/eedbginterfaceimpl.cpp#L1633

    the important part is:

    BOOL fCast = FALSE;
    TypeHandle fromTypeHnd = obj->GetTypeHandle();
    if (fromTypeHnd.CanCastTo(toTypeHnd))
    {
        fCast = TRUE;
    }
    if (Nullable::IsNullableForType(toTypeHnd, obj->GetMethodTable()))
    {
        // allow an object of type T to be cast to Nullable<T> (they have the same representation)
        fCast = TRUE;
    }
    // If type implements ICastable interface we give it a chance to tell us if it can be casted 
    // to a given type.
    else if (toTypeHnd.IsInterface() && fromTypeHnd.GetMethodTable()->IsICastable())
    {
        ...
    }
    if (!fCast && throwCastException)
    {
        COMPlusThrowInvalidCastException(&obj, toTypeHnd);
    }
    

    The important part here is the method that throws the exception. As you can see, it receives both the current object and the type that you are trying to cast to.

    In the end, the throw method calls this:

    https://github.com/dotnet/coreclr/blob/32f0f9721afb584b4a14d69135bea7ddc129f755/src/vm/excep.cpp#L13997

    COMPlusThrow(kInvalidCastException, IDS_EE_CANNOTCAST, strCastFromName.GetUnicode(), strCastToName.GetUnicode());
    

    Which gives you the nice exception message with the type names.

    But when you are casting an object to a value type:

    C#

    void StringBuilderToLong()
    {
        object sbuilder = new StringBuilder();
        long s = (long)sbuilder;
    }
    

    IL

    .method private hidebysig 
        instance void StringBuilderToLong () cil managed 
    {
        // Method begins at RVA 0x2168
        // Code size 15 (0xf)
        .maxstack 1
        .locals init (
            [0] object sbuilder,
            [1] int64 s
        )
    
        IL_0000: nop
        IL_0001: newobj instance void [mscorlib]System.Text.StringBuilder::.ctor()
        IL_0006: stloc.0
        IL_0007: ldloc.0
        IL_0008: unbox.any [mscorlib]System.Int64
        IL_000d: stloc.1
        IL_000e: ret
    }
    

    The important opcode here is:
    http://msdn.microsoft.com/library/system.reflection.emit.opcodes.unbox_any.aspx

    And we can see the UnboxAny behavior here: https://github.com/dotnet/coreclr/blob/32f0f9721afb584b4a14d69135bea7ddc129f755/src/vm/interpreter.cpp#L8766

    //GET THE BOXED VALUE FROM THE STACK
    Object* obj = OpStackGet<Object*>(tos);
    
    //GET THE TARGET TYPE METADATA
    unsigned boxTypeTok = getU4LittleEndian(m_ILCodePtr + 1);
    boxTypeClsHnd = boxTypeResolvedTok.hClass;
    boxTypeAttribs = m_interpCeeInfo.getClassAttribs(boxTypeClsHnd);
    
    //IF THE TARGET TYPE IS A REFERENCE TYPE
    //NOTHING CHANGES FROM ABOVE
    if ((boxTypeAttribs & CORINFO_FLG_VALUECLASS) == 0)
    {
        ObjIsInstanceOf(obj, TypeHandle(boxTypeClsHnd), TRUE); // throws if the cast can't be done
    }
    //ELSE THE TARGET TYPE IS A VALUE TYPE
    else
    {
        unboxHelper = m_interpCeeInfo.getUnBoxHelper(boxTypeClsHnd);
        switch (unboxHelper)
            {
            case CORINFO_HELP_UNBOX:
                    MethodTable* pMT1 = (MethodTable*)boxTypeClsHnd;
                    MethodTable* pMT2 = obj->GetMethodTable();
    
                    if (pMT1->IsEquivalentTo(pMT2))
                    {
                        res = OpStackGet<Object*>(tos)->UnBox();
                    }
                    else
                    {
                        CorElementType type1 = pMT1->GetInternalCorElementType();
                        CorElementType type2 = pMT2->GetInternalCorElementType();
    
                        // we allow enums and their primtive type to be interchangable
                        if (type1 == type2)
                        {
                              res = OpStackGet<Object*>(tos)->UnBox();
                        }
                    }
    
                    //THE RUNTIME DOES NOT KNOW HOW TO UNBOX THIS ITEM
                    if (res == NULL)
                    {
                        COMPlusThrow(kInvalidCastException);
    
                        //I INSERTED THESE COMMENTS
                        //auto thCastFrom = obj->GetTypeHandle();
                        //auto thCastTo = TypeHandle(boxTypeClsHnd);
                        //RealCOMPlusThrowInvalidCastException(thCastFrom, thCastTo);
                    }
                    break;
            case CORINFO_HELP_UNBOX_NULLABLE:
                    InterpreterType it = InterpreterType(&m_interpCeeInfo, boxTypeClsHnd);
                    size_t sz = it.Size(&m_interpCeeInfo);
                    if (sz > sizeof(INT64))
                    {
                        void* destPtr = LargeStructOperandStackPush(sz);
                        if (!Nullable::UnBox(destPtr, ObjectToOBJECTREF(obj), (MethodTable*)boxTypeClsHnd))
                        {
                            COMPlusThrow(kInvalidCastException);
                            //I INSERTED THESE COMMENTS
                            //auto thCastFrom = obj->GetTypeHandle();
                            //auto thCastTo = TypeHandle(boxTypeClsHnd);
                            //RealCOMPlusThrowInvalidCastException(thCastFrom, thCastTo);
                        }
                    }
                    else
                    {
                        INT64 dest = 0;
                        if (!Nullable::UnBox(&dest, ObjectToOBJECTREF(obj), (MethodTable*)boxTypeClsHnd))
                        {
                            COMPlusThrow(kInvalidCastException);
                            //I INSERTED THESE COMMENTS
                            //auto thCastFrom = obj->GetTypeHandle();
                            //auto thCastTo = TypeHandle(boxTypeClsHnd);
                            //RealCOMPlusThrowInvalidCastException(thCastFrom, thCastTo);
                        }
                    }
                    break;
            }
    }
    

    Well... at least it seems possible to give a better exception message. Remember that when the exception had a nice message, the call was:

    COMPlusThrow(kInvalidCastException, IDS_EE_CANNOTCAST, strCastFromName.GetUnicode(), strCastToName.GetUnicode());
    

    and for the less informative message it was:

    COMPlusThrow(kInvalidCastException);
    

    So I think it is possible to improve the message by doing:

    auto thCastFrom = obj->GetTypeHandle();
    auto thCastTo = TypeHandle(boxTypeClsHnd);
    RealCOMPlusThrowInvalidCastException(thCastFrom, thCastTo);
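
    As a rough, self-contained illustration of why this should work, here is a toy C++ model of the unbox check. It is not CLR code: ToyMethodTable, ToyBoxedObject, ToyInvalidCast and ToyUnbox are hypothetical names, and the primitiveOrEnum flag is my own assumption for keeping unrelated types with the same element type from matching. The point is only that both type descriptions are already in hand at the place where the exception is thrown.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; none of these are real CLR types.
enum class ToyElementType { I4, I8, Class };

struct ToyMethodTable {
    std::string name;
    ToyElementType elementType;
    bool primitiveOrEnum;  // assumption: only these may match by element type
};

struct ToyBoxedObject {
    const ToyMethodTable* methodTable;  // dynamic type of the boxed value
    long long payload;                  // the boxed bits
};

struct ToyInvalidCast : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Same shape as the CORINFO_HELP_UNBOX branch: exact type match, or an enum
// and its underlying primitive, otherwise throw. Both types are known here,
// so the message can name them.
inline long long ToyUnbox(const ToyBoxedObject& obj, const ToyMethodTable& target) {
    assert(obj.methodTable != nullptr);
    const ToyMethodTable& actual = *obj.methodTable;
    const bool sameType = (&actual == &target);
    const bool interchangeable = actual.primitiveOrEnum && target.primitiveOrEnum &&
                                 actual.elementType == target.elementType;
    if (sameType || interchangeable) {
        return obj.payload;
    }
    throw ToyInvalidCast("Unable to cast object of type '" + actual.name +
                         "' to type '" + target.name + "'.");
}
```

    In this model, unboxing a boxed System.Int32 as System.Int64 throws with both type names in the message, while unboxing it as System.Int32, or as an enum whose element type is I4, succeeds.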
    

    I have created the following issue on the coreclr GitHub repository to see what the Microsoft developers think.

    https://github.com/dotnet/coreclr/issues/7655