c# .net clr cil boxing

Why does Int32.ToString() emit a call instruction instead of callvirt?


For the following code snippet:

struct Test
{
    public override string ToString()
    {
        return "";
    }
}

public class Program
{
    public static void Main()
    {
        Test a = new Test();
        a.ToString();
        Int32 b = 5;
        b.ToString();
    }
}

The compiler emits the following IL:

  .locals init ([0] valuetype ConsoleApplication2.Test a,
           [1] int32 b)
  IL_0000:  nop
  IL_0001:  ldloca.s   a
  IL_0003:  initobj    ConsoleApplication2.Test
  IL_0009:  ldloca.s   a
  IL_000b:  constrained. ConsoleApplication2.Test
  IL_0011:  callvirt   instance string [mscorlib]System.Object::ToString()
  IL_0016:  pop
  IL_0017:  ldc.i4.5
  IL_0018:  stloc.1
  IL_0019:  ldloca.s   b
  IL_001b:  call       instance string [mscorlib]System.Int32::ToString()
  IL_0020:  pop
  IL_0021:  ret

Since both the value type Test and Int32 override ToString(), I assume no boxing occurs in either a.ToString() or b.ToString(). So why does the compiler emit constrained. + callvirt for Test, but a plain call for Int32?


Solution

  • This is an optimization done by the compiler for primitive types.

    Even for custom structs, though, the runtime executes the callvirt as a direct call when the struct overrides the method, thanks to the constrained. prefix. This lets the compiler emit the same instruction sequence in either case and leave the decision to the runtime.

    From MSDN:

    If thisType is a value type and thisType implements method then ptr is passed unmodified as the this pointer to a call method instruction, for the implementation of method by thisType.

    And:

    The constrained opcode allows IL compilers to make a call to a virtual function in a uniform way independent of whether ptr is a value type or a reference type. Although it is intended for the case where thisType is a generic type variable, the constrained prefix also works for nongeneric types and can reduce the complexity of generating virtual calls in languages that hide the distinction between value types and reference types.
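
    To illustrate the "uniform way" described in that remark, here is a minimal sketch. The type names WithOverride and WithoutOverride and the helper Describe are made up for illustration. For a call on a generic T, the C# compiler is expected to emit constrained. + callvirt Object::ToString(), and the runtime then picks the direct call or the boxing path per type:

```csharp
using System;

// Hypothetical struct that overrides ToString(): the constrained call
// can invoke this implementation directly, without boxing.
struct WithOverride
{
    public override string ToString() => "overridden";
}

// Hypothetical struct with no override: the call resolves to the
// inherited ValueType.ToString(), so the value is boxed first.
struct WithoutOverride
{
}

static class Demo
{
    // One call site serves value types and reference types alike.
    public static string Describe<T>(T value) => value.ToString();

    public static void Main()
    {
        Console.WriteLine(Describe(new WithOverride()));    // overridden
        Console.WriteLine(Describe(new WithoutOverride())); // WithoutOverride
        Console.WriteLine(Describe(5));                     // 5
        Console.WriteLine(Describe("text"));                // text
    }
}
```

    The same source line works for all four arguments; only the runtime's handling of the constrained. prefix differs.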

    I don't know of any official documentation for the optimization, but you can see the remarks in the Roslyn repo for the MayUseCallForStructMethod method.

    As for why this optimization is left to the runtime for non-primitive types, I believe it is because the implementation can change. Imagine referencing a library whose struct originally overrode ToString, then swapping in a version of the DLL (without recompiling!) where the override has been removed. A direct call to the struct's own ToString would then fail at runtime because the method no longer exists, whereas constrained. + callvirt simply falls back to the inherited implementation. For primitives, the compiler can be sure that won't happen.
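
    The condition that matters here, whether the struct itself declares the method, can be inspected with reflection. This is only a sketch of the idea, not what the JIT actually runs; the names Overrides, Inherits and DeclaresToString are hypothetical:

```csharp
using System;

// Hypothetical struct that declares its own ToString().
struct Overrides
{
    public override string ToString() => "custom";
}

// Hypothetical struct that relies on the inherited ValueType.ToString().
struct Inherits
{
}

static class OverrideCheck
{
    // True when T itself declares a parameterless ToString(), which is
    // the case where a direct call to T::ToString() would be resolvable.
    public static bool DeclaresToString<T>() where T : struct
    {
        var method = typeof(T).GetMethod("ToString", Type.EmptyTypes);
        return method != null && method.DeclaringType == typeof(T);
    }

    public static void Main()
    {
        Console.WriteLine(DeclaresToString<Overrides>()); // True
        Console.WriteLine(DeclaresToString<Inherits>());  // False
        Console.WriteLine(DeclaresToString<int>());       // True
    }
}
```

    If a later version of a library turned Overrides into something like Inherits, the answer would flip from True to False without the calling assembly being recompiled, which is the situation that constrained. + callvirt tolerates.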