
.NET IL / MSIL Evaluation Stack fundamentals


I can't seem to find a good answer to these questions.

Here is what I think I know and what I'm fuzzy on.

  • An evaluation stack is a memory buffer like a C-style stack (is it a stack of native int / size_t?)
  • Evaluation stack elements can be either 32 or 64 bits (how are these mixed in a single stack?)
  • Ldloc_0 puts the local variable on the evaluation stack, BUT how, if it's larger than 64 bits?
  • Does Ldloc_0 just store pointers to local variables on the evaluation stack?
  • Are objects stored on the evaluation stack always either pointers or primitive values?
  • If .maxstack is 8, does that mean (8 * size_t)? If so, how, given that I've read docs stating elements are either 32 or 64 bits?

Take the example below. Does this local variable get stored on the evaluation stack by a ptr reference?

public struct MyStruct
{
    public long x, y, z;

    public static MyStruct Foo()
    {
        MyStruct c;
        c.x = 1;
        c.y = 2;
        c.z = 3;
        return c;   
    }
}

"ldloc.0" clearly puts the struct onto the evaluation stack, BUT it's also much larger than 64 bits. Is a reference stored instead?

.class public sequential ansi sealed beforefieldinit MyStruct
    extends [mscorlib]System.ValueType
{
    // Fields
    .field public int64 x
    .field public int64 y
    .field public int64 z

    // Methods
    .method public hidebysig static 
        valuetype MyStruct Foo () cil managed 
    {
        // Method begins at RVA 0x2050
        // Code size 34 (0x22)
        .maxstack 2
        .locals init (
            [0] valuetype MyStruct,
            [1] valuetype MyStruct
        )

        IL_0000: nop
        IL_0001: ldloca.s 0
        IL_0003: ldc.i4.1
        IL_0004: conv.i8
        IL_0005: stfld int64 MyStruct::x
        IL_000a: ldloca.s 0
        IL_000c: ldc.i4.2
        IL_000d: conv.i8
        IL_000e: stfld int64 MyStruct::y
        IL_0013: ldloca.s 0
        IL_0015: ldc.i4.3
        IL_0016: conv.i8
        IL_0017: stfld int64 MyStruct::z
        IL_001c: ldloc.0 // What is actually stored here?
        IL_001d: stloc.1
        IL_001e: br.s IL_0020

        IL_0020: ldloc.1
        IL_0021: ret
    } // end of method MyStruct::Foo

} // end of class MyStruct

Solution

  • The stack's elements are not all of the same size, and can include value types (structs) of any size. From ECMA-335, section I.12.3.2.1:

    The evaluation stack is made up of slots that can hold any data type, including an unboxed instance of a value type.

    [...]

    While some JIT compilers might track the types on the stack in more detail, the CLI only requires that values be one of:

    • int64, an 8-byte signed integer
    • int32, a 4-byte signed integer
    • native int, a signed integer of either 4 or 8 bytes, whichever is more convenient for the target architecture
    • F, a floating point value (float32, float64, or other representation supported by the underlying hardware)
    • &, a managed pointer
    • O, an object reference
    • *, a “transient pointer,” which can be used only within the body of a single method, that points to a value known to be in unmanaged memory (see the CIL Instruction Set specification for more details. * types are generated internally within the CLI; they are not created by the user).
    • A user-defined value type

    A little earlier, in section I.12.1:

    User-defined value types can appear in memory locations or on the stack and have no size limitation

    So in your case the ldloc.0 instruction is loading the entirety of the value type instance - with its three data fields - onto the stack.

    Thanks to this answer for pointing me toward these ECMA sections. That answer and the others on the same question explain why the stack can be measured in slots rather than bytes: the JIT compiler already has to work out how to convert the MSIL into native instructions, so it must know the type of every value on the stack at every instruction.
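
    To make the slot counting concrete, here is a rough sketch of the method from the question with the stack contents annotated after each instruction. It is a trimmed version (the debug-only nop, the stloc.1/ldloc.1 pair and the br.s are dropped), and the two .assembly lines and the StackDemo name are my own scaffolding so the file can stand alone, not something from the original output. The annotations describe the abstract CLI stack only; how a given JIT maps those slots to registers or native stack memory is up to the JIT.

    ```
    // Hypothetical scaffolding so the file is self-contained (e.g. for ilasm /dll).
    .assembly extern mscorlib {}
    .assembly StackDemo {}

    .class public sequential ansi sealed beforefieldinit MyStruct
        extends [mscorlib]System.ValueType
    {
        .field public int64 x
        .field public int64 y
        .field public int64 z

        .method public hidebysig static
            valuetype MyStruct Foo () cil managed
        {
            .maxstack 2                  // deepest point below is 2 slots, whatever their byte size
            .locals init (
                [0] valuetype MyStruct
            )

            // Stack contents after each instruction, top of stack on the right.
            ldloca.s 0                   // [&]          1 slot: managed pointer to local 0
            ldc.i4.1                     // [&, int32]   2 slots
            conv.i8                      // [&, int64]   2 slots
            stfld int64 MyStruct::x      // []           0 slots
            ldloca.s 0                   // [&]          1 slot
            ldc.i4.2                     // [&, int32]   2 slots
            conv.i8                      // [&, int64]   2 slots
            stfld int64 MyStruct::y      // []           0 slots
            ldloca.s 0                   // [&]          1 slot
            ldc.i4.3                     // [&, int32]   2 slots
            conv.i8                      // [&, int64]   2 slots
            stfld int64 MyStruct::z      // []           0 slots
            ldloc.0                      // [MyStruct]   1 slot holding the whole value (3 x int64 = 24 bytes)
            ret                          // []           the value is handed back to the caller
        }
    }
    ```

    The stack never holds more than two values at once, which is all that .maxstack 2 states. The single slot produced by ldloc.0 carries the complete struct rather than a pointer to it; ldloca.s is the instruction that pushes a pointer (&) to the local instead.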