Tags: c#, .net, assembly, compiler-optimization, roslyn

Why is there a cmp + je in every method in C# JIT assembly code in a Debug build?


When you put a simple class

public sealed class C {
    public static void M() {
    }
}

into https://sharplab.io/

it translates to the following (annotations are mine; source):

C.M()
    L0000: push ebp      /////////////////////
    L0001: mov ebp, esp  // function frame initialization
    L0003: push edi      /////////////////////
    L0004: cmp dword ptr [0x281dc19c], 0 // if (0 == ???)
    L000b: je short L0012  // then: jump to actual method body
    L000d: call 0x727a8790 // else: call ??? what ???
    L0012: nop           // the actual method body
    L0013: nop           // the actual method body
    L0014: pop edi       /////////////////////
    L0015: pop ebp       // function frame teardown/exit
    L0016: ret           /////////////////////

What's the purpose of L0004 to L000d?

    L0004: cmp dword ptr [0x281dc19c], 0 // if (0 == ???)
    L000b: je short L0012  // then: jump to actual method body
    L000d: call 0x727a8790 // else: call ??? what ??

What is the called function?
Does it terminate the process? Why does the C# JIT put this in every method?

I thought it might have something to do with inheritance, but I sealed the class and made the method static to eliminate that option.

Is it some kind of sanity check? like a check for:

  • method overload
  • corrupt code
  • stack overflow
  • segfault

The IL does not give a clue:

    .method public hidebysig static 
        void M () cil managed 
    {
        // Method begins at RVA 0x2069
        // Code size 2 (0x2)
        .maxstack 8

        IL_0000: nop
        IL_0001: ret
    }

Update:

Thanks @Dai, setting it to Release removes the code. To be sure that the complete removal was not caused by the empty method body, I added a simple statement:

    public static void M() {
        System.Console.WriteLine(7);
    }

The JIT in Release mode produces (source):

C.M()
    L0000: mov ecx, 7
    L0005: call dword ptr [0x10a25768]
    L000b: ret

As @Dai said, the cmp/je pair is gone.


Solution

  • Let's play a fun game of "what's that doing in my code?"

    Step 1: Compile it locally

    I compiled this with csc /define:DEBUG; /debug+ /debug:portable (and other options).

    I added the blocking call to Console.ReadLine() so that we don't need to fiddle with setting a breakpoint in the debugger.

    using System;
    
    namespace JitHmm
    {
        class Program
        {
            static void Main( string[] args )
            {
                C.M();
            }
        }
    
        public static class C
        {
            public static void M()
            {
                Console.WriteLine( "Foo" );
                _ = Console.ReadLine();
            }
        }
    }
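
    Since the question found that the probe disappears in a Release build, it can help to confirm which kind of build you are actually inspecting. The following is a minimal sketch, not part of the original investigation: it reads the assembly's DebuggableAttribute, which the C# compiler emits, and reports whether JIT optimizations are disabled. The BuildKind class and its method name are made up for illustration, and I am assuming (not proving) that this flag lines up with the builds in which the cmp/je probe appears.

    ```csharp
    using System;
    using System.Diagnostics;
    using System.Reflection;

    // Hypothetical helper type, introduced only for this sketch.
    public static class BuildKind
    {
        // Returns true when the assembly carries a DebuggableAttribute that
        // asks the JIT not to optimize (the usual shape of a Debug build).
        public static bool IsJitOptimizerDisabled(Assembly assembly)
        {
            var attr = assembly.GetCustomAttribute<DebuggableAttribute>();
            return attr != null && attr.IsJITOptimizerDisabled;
        }

        public static void Main()
        {
            Assembly self = typeof(BuildKind).Assembly;
            Console.WriteLine(IsJitOptimizerDisabled(self)
                ? "JIT optimizer disabled (Debug-style build)"
                : "JIT optimizer enabled (Release-style build)");
        }
    }
    ```

    Compiling this once with /debug+ and no /optimize, and once with /optimize+, should give the two different messages.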
    
    

    Step 2: Start the program via WinDbg + SOS:


    1. As the program runs, it will print "Foo" to stdout then wait inside Console.ReadLine().

    2. If you tell WinDbg to break, it will show the .NET main thread (apphost) waiting inside NtReadFile (as Console.ReadLine is trying to read from stdin).

    3. Don't forget to load symbols, so WinDbg's Disassembly window will show you the resolved function names for call instructions instead of raw memory addresses.

    4. Open the Stack Trace window and walk up to the frame that represents the M() function. This will be the frame right before the first System_Console frame.

      • WinDbg doesn't seem to use CLR symbols to show the C# method names in the Stack window, but if you run !DumpStack -EE then you'll see C.M() in the dumped stack in the main command output window.
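    5. Navigate to the C.M() function. The Disassembly window should show you roughly the same contents as SharpLab. Note that this listing starts one byte before the method's real entry point at 00007ffe`9a615f00, so its first two lines are misdecoded (compare with the !u output in step 8):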

      00007ffe`9a615eff 005548           add     byte ptr [rbp+48h], dl
      00007ffe`9a615f02 83ec30           sub     esp, 30h
      00007ffe`9a615f05 488d6c2430       lea     rbp, [rsp+30h]
      00007ffe`9a615f0a 33c0             xor     eax, eax
      00007ffe`9a615f0c 8945fc           mov     dword ptr [rbp-4], eax
      00007ffe`9a615f0f 488945f0         mov     qword ptr [rbp-10h], rax
      00007ffe`9a615f13 833d16cb090000   cmp     dword ptr [7FFE9A6B2A30h], 0
      00007ffe`9a615f1a 7405             je      00007FFE9A615F21
      00007ffe`9a615f1c e8cf26c85f       call    00007FFEFA2985F0
      00007ffe`9a615f21 90               nop     
      00007ffe`9a615f22 33c9             xor     ecx, ecx
      00007ffe`9a615f24 894dfc           mov     dword ptr [rbp-4], ecx
      00007ffe`9a615f27 488b0c258030701a mov     rcx, qword ptr [1A703080h]
      00007ffe`9a615f2f e8acffffff       call    00007FFE9A615EE0
      00007ffe`9a615f34 90               nop     
      00007ffe`9a615f35 e836ffffff       call    00007FFE9A615E70
      00007ffe`9a615f3a 488945f0         mov     qword ptr [rbp-10h], rax
      
    6. Instruct WinDbg to load symbols. After a brief wait you'll see the call 00007FFEFA2985F0 line in the Disassembly window change to call coreclr!JIT_DbgIsJustMyCode (7ffefa2985f0)

    7. ...so just what is JIT_DbgIsJustMyCode?

    8. Run the command !u 00007ffe`9a615f13, which prints an annotated disassembly of the managed method containing the address 00007ffe`9a615f13. It shows me this:

      Normal JIT generated code
      JitHmm.C.M()
      ilAddr is 0000000000592064 pImport is 0000000002BFD460
      Begin 00007FFE9A615F00, size 46
      
      C:\git\_bollocks\JitHmm\Program.cs @ 16:
      00007ffe`9a615f00 55              push    rbp
      00007ffe`9a615f01 4883ec30        sub     rsp,30h
      00007ffe`9a615f05 488d6c2430      lea     rbp,[rsp+30h]
      00007ffe`9a615f0a 33c0            xor     eax,eax
      00007ffe`9a615f0c 8945fc          mov     dword ptr [rbp-4],eax
      00007ffe`9a615f0f 488945f0        mov     qword ptr [rbp-10h],rax
      >>> 00007ffe`9a615f13 833d16cb090000  cmp     dword ptr [00007ffe`9a6b2a30],0
      00007ffe`9a615f1a 7405            je      00007ffe`9a615f21
      00007ffe`9a615f1c e8cf26c85f      call    coreclr!GetCLRRuntimeHost+0x82700 (00007ffe`fa2985f0) (JitHelp: CORINFO_HELP_DBG_IS_JUST_MY_CODE)
      00007ffe`9a615f21 90              nop
      
      C:\git\_bollocks\JitHmm\Program.cs @ 17:
      00007ffe`9a615f22 33c9            xor     ecx,ecx
      00007ffe`9a615f24 894dfc          mov     dword ptr [rbp-4],ecx
      
      C:\git\_bollocks\JitHmm\Program.cs @ 19:
      00007ffe`9a615f27 488b0c258030701a mov     rcx,qword ptr [1A703080h] ("Foo")
      00007ffe`9a615f2f e8acffffff      call    00007ffe`9a615ee0
      00007ffe`9a615f34 90              nop
      
      C:\git\_bollocks\JitHmm\Program.cs @ 20:
      00007ffe`9a615f35 e836ffffff      call    00007ffe`9a615e70
      00007ffe`9a615f3a 488945f0        mov     qword ptr [rbp-10h],rax
      00007ffe`9a615f3e 90              nop
      
      C:\git\_bollocks\JitHmm\Program.cs @ 21:
      00007ffe`9a615f3f 90              nop
      00007ffe`9a615f40 488d6500        lea     rsp,[rbp]
      00007ffe`9a615f44 5d              pop     rbp
      00007ffe`9a615f45 c3              ret
      
    9. Note the (JitHelp: CORINFO_HELP_DBG_IS_JUST_MY_CODE) part - that's something we can search on .NET's main GitHub repo

    10. ...which resolves to this actual function baked-in to the CLR itself: void JIT_DbgIsJustMyCode()

    11. ...which is documented in debug/ee/debugger.h as well as coreclr/vm/jithelpers.cpp:

      // The jit injects probes in debuggable managed methods that look like:
      // if (*pFlag != 0) call JIT_DbgIsJustMyCode.
      // pFlag is unique per-method constant determined by GetJMCFlagAddr.
      // JIT_DbgIsJustMyCode will get the ip & fp and call OnMethodEnter.
      // pIP is an ip within the method, right after the prolog.
      
      // Callback for Just-My-Code probe
      // Probe looks like:
      //  if (*pFlag != 0) call JIT_DbgIsJustMyCode
      // So this is only called if the flag (obtained by GetJMCFlagAddr) is
      //  non-zero.
      
    12. Therefore, the mystery cmp and je instructions are the emitted code for if(*pFlag != 0): the je skips the call whenever the flag is zero.

    13. The actual GetJMCFlagAddr function is not provided by the CLR.
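
    The probe from steps 11 and 12 can be modelled in plain C#. This is only a sketch of the control flow described in the CLR comments, not the runtime's implementation: Flag, Helper and HelperCalls are hypothetical stand-ins for the memory location behind pFlag and for JIT_DbgIsJustMyCode, and the assignment to Flag merely imitates whatever sets the real flag when a debugger is interested.

    ```csharp
    using System;

    // Hypothetical model type, introduced only for this sketch.
    public static class JmcProbeModel
    {
        // Stand-in for the dword that `cmp dword ptr [flag], 0` reads.
        public static int Flag;

        // Counts how often the stand-in helper was reached.
        public static int HelperCalls;

        // Stand-in for coreclr's JIT_DbgIsJustMyCode helper.
        private static void Helper() => HelperCalls++;

        public static void M()
        {
            // Equivalent of: cmp [flag], 0 / je skip / call helper
            if (Flag != 0)
            {
                Helper();
            }

            // ...the actual method body would follow here...
        }

        public static void Main()
        {
            M();        // Flag is 0, so the helper call is skipped.
            Flag = 1;   // Imitates the flag being switched on.
            M();        // Flag is non-zero, so the helper runs once.
            Console.WriteLine(HelperCalls); // prints 1
        }
    }
    ```

    With the flag at zero, the only cost per call is one compare and one taken branch, which is presumably why the check is inlined into every debuggable method instead of always calling the helper.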


    So, what does it do?

    It allows your debugger to be informed when the program's execution-point reaches a user function (as opposed to a library-function) which is how the "Just my code" debugger option in Visual Studio works.

    ...though I'm unsure why they do it this way, instead of (for example) using only PDB Symbols.