Consider the following C# struct definitions:
public struct A
{
public B B;
}
public struct B
{
public int C;
}
Also consider the following static method:
public static int Method(A a) => a.B.C;
Calling this method will result in a copy of the struct type A. For example, in the following code:
A a = default;
Method(a);
the call to Method will compile to IL that looks something like this:
IL_0008: ldloc.0 // V_0
IL_0009: call int32 Class::Method(valuetype A)
ldloc will copy the value of local variable a (V_0) onto the evaluation stack, and that value will be used in Method. If A (or B) were a large struct, this copy could supposedly be expensive. The IL for Method also results in load-value instructions:
IL_0000: ldarg.0 // a
IL_0001: ldfld valuetype B A::B
IL_0006: ldfld int32 B::C
IL_000b: ret
Recent versions of C# include features that can help make working with structs more efficient. C# 7.2 introduced the in modifier on parameters, which passes a value type by read-only reference; the compiler enforces that the called method does not modify the argument. For example, applying the in modifier to parameter a:
public static int Method(in A a) => a.B.C;
will result in the following compiled IL at the call site:
IL_0008: ldloca.s a
IL_000a: call int32 Class::Method(valuetype A&)
and in the implementation of Method:
IL_0000: ldarg.0 // a
IL_0001: ldflda valuetype B A::B
IL_0006: ldfld int32 B::C
IL_000b: ret
Note the load-address instructions. My assumption (please correct me if I am wrong) is that for deep field reads (such as reading C that's inside of B that's inside of A), load-address instructions are more efficient than load-value instructions.
With that in mind, consider changing the example code:
A a = default;
var c = a.B.C;
The second line then compiles to:
IL_0008: ldloc.1 // V_1
IL_0009: ldfld valuetype B A::B
IL_000e: ldfld int32 B::C
IL_0013: stloc.0 // c
Why wouldn't the compiler prefer to use load-address instructions in this case too? Is there an efficiency difference simply because a is a local variable versus a method parameter, or is there something else I'm missing here?
It's definitely not related to a being a local variable rather than a method argument, at least not from an efficiency point of view.
The first thing to understand is that structs in C# are stored inline, directly where they are declared, so a local variable of struct type lives directly on the stack. More importantly, nested structs behave the same way. The layout is not always fixed at C# compile time (read more about StructLayoutAttribute), but at run time the JIT knows exactly where B is inside of A, where C is inside of B, and therefore where B.C lies inside of a.
When you look at the assembly code the JIT produces for the method, you'll see that no matter where you write a.B.C, it always becomes a direct read from memory at a fixed offset from where a is stored. It's important to compile in Release, since Debug builds are not optimized the same way, and to make sure the compiler doesn't optimize the variables away entirely.
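One way to set up such an experiment is sketched below. The class and method names (JitProbe, ReadNested, ReadNestedIn) are made up for illustration and are not from the question. The sketch assumes a Release build; on .NET 7 or later, setting the environment variable DOTNET_JitDisasm to a method name should print the generated assembly, while on other runtimes a debugger's disassembly view would be needed instead.

```csharp
using System;
using System.Runtime.CompilerServices;

public struct B { public int C; }
public struct A { public B B; }

public static class JitProbe
{
    // NoInlining keeps each method as a separate body in the JIT output,
    // so its disassembly is easy to locate and compare.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int ReadNested(A a) => a.B.C;

    // Same read, but through a read-only reference (hypothetical counterpart).
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int ReadNestedIn(in A a) => a.B.C;

    public static void Main()
    {
        A a = default;
        a.B.C = 42;
        // Printing the results keeps the reads from being removed as dead code.
        Console.WriteLine(ReadNested(a));
        Console.WriteLine(ReadNestedIn(in a));
    }
}
```

Both methods return the same value; the point of the sketch is only to give the JIT two small, non-inlined bodies whose generated code can be compared.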
In my case, I added another field, int a1, inside A to shift the offsets a bit. This is the resulting code:
A a = default;
xor ecx,ecx
mov qword ptr [rbp-30h],rcx
var c = a.B.C;
mov esi,dword ptr [rbp-2Ch]
where esi is the register holding var c and [rbp-30h] is where a sits on the stack. B has an integer at offset 0, and A has an integer at offset 0 and B at offset 4, so the final address of a.B.C is always a+4 ([rbp-2Ch]).
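The offset arithmetic above can be checked from managed code. The following is a minimal sketch, assuming the extra field a1 is declared before B and that the struct uses the default sequential layout; OffsetCheck and OffsetOfC are hypothetical names, and the System.Runtime.CompilerServices.Unsafe API is assumed to be available (it is built into .NET Core 3.0 and later).

```csharp
using System;
using System.Runtime.CompilerServices;

public struct B { public int C; }
public struct A { public int a1; public B B; }

public static class OffsetCheck
{
    // Returns the byte distance from the start of an A value to its nested B.C field.
    public static long OffsetOfC()
    {
        A a = default;
        // Reinterpret the start of the struct and the nested field as byte references.
        ref byte start = ref Unsafe.As<A, byte>(ref a);
        ref byte field = ref Unsafe.As<int, byte>(ref a.B.C);
        // ByteOffset computes target minus origin, in bytes.
        return (long)Unsafe.ByteOffset(ref start, ref field);
    }

    public static void Main() => Console.WriteLine(OffsetOfC());
}
```

Under those assumptions the result should match the a+4 computed above: 4 bytes for a1, then B.C at offset 0 inside B.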