Tags: c#, .net, reflection.emit, il, bounds-check-elimination

Array bounds check elimination in a dynamic assembly only works when the evaluation stack is empty


I have a simple for loop with array access, emitted using ILGenerator. When the method is created with exactly the code below, the disassembly looks fine: there is no array bounds check.

But when I first push an instance of another class onto the evaluation stack and then run the for loop, the JIT emits an array bounds check. I'm running a release build.

Any idea why? I've already read the blog post about array bounds check elimination: http://blogs.msdn.com/b/clrcodegeneration/archive/2009/08/13/array-bounds-check-elimination-in-the-clr.aspx

        // Uncomment this to bring the bounds check back; arg0 is an instance of one of my own classes
        //il.Emit(OpCodes.Ldarg_0);

        var startLbl = il.DefineLabel();
        var testLbl = il.DefineLabel();
        var index = il.DeclareLocal(typeof(Int32));
        var arr = il.DeclareLocal(typeof(Int32).MakeArrayType());

        // arr = new int[4];
        il.Emit(OpCodes.Ldc_I4_4);
        il.Emit(OpCodes.Newarr, typeof(Int32));
        il.Emit(OpCodes.Stloc, arr);

        // Index = 0
        il.Emit(OpCodes.Ldc_I4_0); // Push index
        il.Emit(OpCodes.Stloc, index); // Pop index, store

        il.Emit(OpCodes.Br_S, testLbl); // Go to test

        // Begin for
        il.MarkLabel(startLbl);

        // Load array, index
        il.Emit(OpCodes.Ldloc, arr);
        il.Emit(OpCodes.Ldloc, index);

        // Now on stack: array, index
        // Load element
        il.Emit(OpCodes.Ldelem_I4);
        // Nothing here now, later some function call
        il.Emit(OpCodes.Pop);

        // Index++
        il.Emit(OpCodes.Ldloc, index);
        il.Emit(OpCodes.Ldc_I4_1);
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Stloc, index);

        il.MarkLabel(testLbl);
        // Load index, count, test for end
        il.Emit(OpCodes.Ldloc, index);
        il.Emit(OpCodes.Ldloc, arr);
        il.Emit(OpCodes.Ldlen); // Push length (native unsigned int)
        il.Emit(OpCodes.Conv_I4); // Convert length to int32
        il.Emit(OpCodes.Blt_S, startLbl);
        // End for

        // Remove the instance pushed at the start (uncomment together with Ldarg_0 above)
        //il.Emit(OpCodes.Pop);

When generating IL code, is it better to keep instances of classes on the evaluation stack or in local variables?

E.g. I get an instance, go through its fields, do something for each field, and then return. I've just kept the instance on the stack and called Emit(OpCodes.Dup) before reading the next field. But that seems wrong (at least for the case mentioned above).
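To make the two options concrete, here is a minimal sketch of both patterns for reading the fields of an instance. The `Point` class, its fields, and the `FieldReaders` builder names are hypothetical, made up purely for illustration; both variants compute the same sum and differ only in where the instance lives between field reads.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical type used only for this illustration.
public sealed class Point { public int X; public int Y; }

public static class FieldReaders
{
    // Variant A: keep the instance on the evaluation stack and Dup it
    // before every field read except the last one.
    public static Func<Point, int> BuildWithDup()
    {
        var dm = new DynamicMethod("SumDup", typeof(int), new[] { typeof(Point) });
        var il = dm.GetILGenerator();
        var sum = il.DeclareLocal(typeof(int));

        il.Emit(OpCodes.Ldarg_0);                              // stack: p
        il.Emit(OpCodes.Dup);                                  // stack: p, p
        il.Emit(OpCodes.Ldfld, typeof(Point).GetField("X"));   // stack: p, p.X
        il.Emit(OpCodes.Stloc, sum);                           // stack: p
        il.Emit(OpCodes.Ldfld, typeof(Point).GetField("Y"));   // stack: p.Y
        il.Emit(OpCodes.Ldloc, sum);                           // stack: p.Y, sum
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Ret);

        return (Func<Point, int>)dm.CreateDelegate(typeof(Func<Point, int>));
    }

    // Variant B: park the instance in a local and reload it for each read,
    // so the evaluation stack is empty between the individual steps.
    public static Func<Point, int> BuildWithLocal()
    {
        var dm = new DynamicMethod("SumLocal", typeof(int), new[] { typeof(Point) });
        var il = dm.GetILGenerator();
        var instance = il.DeclareLocal(typeof(Point));

        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Stloc, instance);                      // stack: (empty)
        il.Emit(OpCodes.Ldloc, instance);
        il.Emit(OpCodes.Ldfld, typeof(Point).GetField("X"));   // stack: p.X
        il.Emit(OpCodes.Ldloc, instance);
        il.Emit(OpCodes.Ldfld, typeof(Point).GetField("Y"));   // stack: p.X, p.Y
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Ret);

        return (Func<Point, int>)dm.CreateDelegate(typeof(Func<Point, int>));
    }
}
```

Calling either delegate with `new Point { X = 3, Y = 4 }` returns 7. The sketch says nothing about which variant the JIT optimizes better; it only shows the shape of the IL in each case.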

Any articles/blog posts about generating (efficient/well-formed) IL code would be appreciated.


Solution

  • In general, using locals will typically result in more readable code that is easier to debug, which matters given that IL is already something most developers are not used to reading. There's even a chance the JIT will eliminate any performance penalty there might be for doing so.

    From what I've seen poking around in ILSpy, csc prefers locals too, although I have to admit that when I've looked at IL rather than decompiling to C#, it has mostly been debug code. Since the JIT is probably written with the expectation that it will mostly be running over the output of Microsoft's compilers, it wouldn't be a surprise if it failed to recognize looping constructs that don't match what those compilers would emit. It's very plausible that the extra stack entry is foiling the JIT's ability to recognize that it can eliminate the bounds check.
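A minimal sketch of the locals-based approach applied to the loop from the question: the instance is stored in a local right away, so the evaluation stack is empty when the loop starts, and it is reloaded only after the loop. The `Holder` class, its `Value` field, and the `LoopWithLocal` builder are hypothetical stand-ins, and whether the JIT then drops the bounds check is an assumption to verify in the disassembly of a release build, not something this sketch guarantees.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical stand-in for the class whose instance is passed as arg0.
public sealed class Holder { public int Value = 7; }

public static class LoopWithLocal
{
    public static Func<Holder, int> Build()
    {
        var dm = new DynamicMethod("LoopWithLocal", typeof(int), new[] { typeof(Holder) });
        var il = dm.GetILGenerator();

        var startLbl = il.DefineLabel();
        var testLbl = il.DefineLabel();
        var index = il.DeclareLocal(typeof(int));
        var arr = il.DeclareLocal(typeof(int[]));
        var instance = il.DeclareLocal(typeof(Holder));

        // instance = arg0; the evaluation stack is empty again afterwards
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Stloc, instance);

        // arr = new int[4]
        il.Emit(OpCodes.Ldc_I4_4);
        il.Emit(OpCodes.Newarr, typeof(int));
        il.Emit(OpCodes.Stloc, arr);

        // index = 0, then jump to the loop test
        il.Emit(OpCodes.Ldc_I4_0);
        il.Emit(OpCodes.Stloc, index);
        il.Emit(OpCodes.Br_S, testLbl);

        // Loop body: load arr[index] and discard it, as in the question
        il.MarkLabel(startLbl);
        il.Emit(OpCodes.Ldloc, arr);
        il.Emit(OpCodes.Ldloc, index);
        il.Emit(OpCodes.Ldelem_I4);
        il.Emit(OpCodes.Pop);

        // index++
        il.Emit(OpCodes.Ldloc, index);
        il.Emit(OpCodes.Ldc_I4_1);
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Stloc, index);

        // Loop test: index < arr.Length
        il.MarkLabel(testLbl);
        il.Emit(OpCodes.Ldloc, index);
        il.Emit(OpCodes.Ldloc, arr);
        il.Emit(OpCodes.Ldlen);    // length as native unsigned int
        il.Emit(OpCodes.Conv_I4);  // convert to int32 for the comparison
        il.Emit(OpCodes.Blt_S, startLbl);

        // Only now bring the instance back onto the stack and use it
        il.Emit(OpCodes.Ldloc, instance);
        il.Emit(OpCodes.Ldfld, typeof(Holder).GetField("Value"));
        il.Emit(OpCodes.Ret);

        return (Func<Holder, int>)dm.CreateDelegate(typeof(Func<Holder, int>));
    }
}
```

`LoopWithLocal.Build()(new Holder())` runs the loop over the four-element array and then returns the field value, 7. For an argument, simply emitting `Ldarg_0` again after the loop would work as well; the local is shown because it also covers instances that come from a call or a field read.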