
How do JIT compilers stay ahead of executing machine code?


If I understand correctly, JIT compilers compile code (often bytecode) into native machine code on the fly and insert it into the proper spot in known memory.

Once that process is started, how does the JIT compiler stay ahead of the machine code that's executing? How can it be ensured that the executing code won't come across blank memory where it was pointed with a GOTO or equivalent because the JIT hasn't figured out what to put there next?

For instance, given some (fake) bytecode:

03 01 move variable 1 onto the stack
b3 02 do something with the contents

After generating the first line of native code and placing it next in line to be run, I'm assuming the JIT will give the native code a "GOTO" to an empty region of memory in which to run the next batch of instructions. But what if execution gets there before the JIT compiler has had time to put the machine code for line 2 in that slot?


Solution

  • Correctness is ensured by following two rules:

    Never allow execution of unfinished code

    A JIT compiler first finishes compiling whatever region of code it is working on; this could be a basic block, a function, or an arbitrary trace through the code. Only after it has finished does it allow the processor to execute that code, so execution never comes across an unfinished translation.

    Do not generate undefined jumps

    Whenever the JIT compiler comes across a jump that leaves the region of code being compiled, it generates a jump back into the interpreter, which figures out where to continue execution, possibly by compiling some other region of code first. It never generates a jump to an undefined location. The same is done at the end of the compiled region.

    Some JITs also compile to functions that follow the calling conventions of the machine and can thus just use an ordinary return (LLVM JIT would be an example of that). In this case the "JITed" code is just called via a function pointer, and the code just returns to the caller which is the interpreter.

    Other JIT compilers generate a custom prologue and epilogue around the generated code, which ensure that the processor is in a defined state after the JIT-compiled code has run and that all information needed to continue execution is available.
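    The two rules can be sketched with a toy dispatcher loop. This is only an illustration under stated assumptions: the instruction set (`set`, `add`, `jmp`, `jnz`, `halt`), the names `compile_block` and `run`, and the use of Python closures in place of real machine code are all made up for this example and do not describe any particular JIT.

```python
from typing import Callable, Dict, List, Optional, Tuple

Instr = Tuple[str, int]                    # (opcode, operand) of a made-up instruction set
State = Dict[str, int]                     # guest state: a single accumulator
Block = Callable[[State], Optional[int]]   # compiled block: runs, returns next pc (None = halt)

PROGRAM: List[Instr] = [
    ("set", 5),    # 0: acc = 5
    ("jmp", 3),    # 1: goto 3
    ("set", 99),   # 2: never reached, so never compiled
    ("add", -1),   # 3: acc += -1
    ("jnz", 3),    # 4: if acc != 0 goto 3, else fall through to 5
    ("halt", 0),   # 5: stop
]


def compile_block(program: List[Instr], start: int) -> Block:
    """Translate one basic block: straight-line code up to the first control transfer."""
    ops: List[Callable[[State], None]] = []   # scratch buffer; nothing can execute it yet
    pc = start
    while True:
        op, arg = program[pc]
        if op == "set":
            ops.append(lambda s, v=arg: s.__setitem__("acc", v))
        elif op == "add":
            ops.append(lambda s, v=arg: s.__setitem__("acc", s["acc"] + v))
        elif op == "jmp":
            exit_fn = lambda s, t=arg: t
            break
        elif op == "jnz":
            exit_fn = lambda s, t=arg, f=pc + 1: t if s["acc"] != 0 else f
            break
        elif op == "halt":
            exit_fn = lambda s: None
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
        pc += 1

    def block(state: State) -> Optional[int]:
        for step in ops:
            step(state)
        # Rule 2: the block never jumps "somewhere"; it hands a pc back to the dispatcher.
        return exit_fn(state)

    return block


def run(program: List[Instr]) -> Tuple[State, Dict[int, Block]]:
    cache: Dict[int, Block] = {}
    state: State = {"acc": 0}
    pc: Optional[int] = 0
    while pc is not None:
        block = cache.get(pc)
        if block is None:
            block = compile_block(program, pc)   # Rule 1: compile to completion first...
            cache[pc] = block                    # ...and only then make it reachable
        pc = block(state)                        # an ordinary call and return
    return state, cache


final_state, code_cache = run(PROGRAM)
print(final_state["acc"], sorted(code_cache))
```

    Execution can never outrun the compiler here because the dispatcher is the only thing that starts a block, and it does so only after `compile_block` has returned. In a real JIT the call `block(state)` would be a call through a function pointer into freshly written executable memory.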

    As an optimization, the JIT may notice that a jump goes to code that has already been JIT-compiled, or that is statically precompiled (e.g. a library function), and emit a direct jump there. Alternatively, it can emit a jump instruction that is later patched to go to a newly compiled piece of code (QEMU does that).
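    The patching idea can be sketched as follows. This is a hedged toy model, not QEMU's implementation: the `GUEST` table, the `Exit` and `Block` classes, and the `linked` field are hypothetical, and a Python attribute stands in for a machine jump instruction that a real JIT would rewrite in place.

```python
from typing import Dict, Optional, Tuple

# Hypothetical guest program: pc -> (delta, target_if_nonzero, target_if_zero).
# Each entry is one basic block: add delta to acc, then branch. None means halt.
GUEST: Dict[int, Tuple[int, Optional[int], Optional[int]]] = {
    0: (100, 10, 10),     # acc += 100, then go to 10
    10: (-1, 10, 20),     # acc -= 1; loop while acc != 0, else go to 20
    20: (0, None, None),  # halt
}


class Exit:
    """A patchable jump: the guest target plus a slot for the compiled successor."""

    def __init__(self, target_pc: Optional[int]) -> None:
        self.target_pc = target_pc
        self.linked: Optional["Block"] = None   # None means "go back to the dispatcher"


class Block:
    """Stand-in for one compiled basic block with two exits."""

    def __init__(self, delta: int, taken: Optional[int], fallthrough: Optional[int]) -> None:
        self.delta = delta
        self.taken = Exit(taken)
        self.fallthrough = Exit(fallthrough)

    def execute(self, state: Dict[str, int]) -> Exit:
        state["acc"] += self.delta
        return self.taken if state["acc"] != 0 else self.fallthrough


def run(guest: Dict[int, Tuple[int, Optional[int], Optional[int]]]) -> Tuple[Dict[str, int], int]:
    cache: Dict[int, Block] = {}
    state = {"acc": 0}

    def lookup_or_compile(pc: int) -> Block:
        if pc not in cache:
            cache[pc] = Block(*guest[pc])   # fully built before it is published
        return cache[pc]

    dispatcher_entries = 1                  # the initial entry at pc 0
    block = lookup_or_compile(0)
    while True:
        exit_ = block.execute(state)
        if exit_.target_pc is None:         # guest program halted
            break
        if exit_.linked is None:
            # Slow path: the jump is still unpatched, so control returns to the
            # dispatcher, which finds or compiles the target and patches the jump.
            dispatcher_entries += 1
            exit_.linked = lookup_or_compile(exit_.target_pc)
        block = exit_.linked                # fast path: follow the patched jump
    return state, dispatcher_entries


final_state, slow_paths = run(GUEST)
print(final_state["acc"], slow_paths)
```

    In this sketch the loop block runs 100 times but reaches the dispatcher only when one of its two exits is still unpatched. An unpatched exit always leads back to the dispatcher, so the jump target is defined both before and after patching.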