
How does JIT compilation actually execute the machine code at runtime?


I understand the gist of how JIT compilation works (after reading such resources as this SO question). However, I am still wondering how it actually executes the machine code at runtime.

I don't have a deep background in operating systems or compiler optimizations, and haven't done anything with machine code directly, but am starting to explore it. I have started playing around with assembly, and I see how something like NASM can take your assembly code and assemble it into machine code (the executable), which you can then "invoke" from the command line like ./my-executable.

But how is a JIT compiler actually doing that at runtime? Is it like streaming machine code into stdin or something, or how does it work? If you could provide an example or some pseudocode of how some assembly (or something along those lines, not as high level as C though) might look to demonstrate the basic flow, that would be amazing too.


Solution

  • You mentioned that you have played around with assembly, so you have some idea of how that works. Good. Imagine that you write code that allocates a buffer (e.g. at address 0x75612d39). Your code then writes machine instructions into that buffer: the instructions to pop a number from the stack, the instructions to call a print function that prints that number, and the instruction to "return". Then you push the number 3 onto the stack and call/jump to address 0x75612d39. The processor will obey the instructions in the buffer, printing the number, then return to your code and continue. At the assembly level it's actually pretty straightforward.

    I don't know any "real" assembly languages, but here's a "sample" cobbled together from a bytecode I know. This machine has 2-byte pointers, the string "%d" is located at address 6a, and the function printf is located at address 1388.

    void myfunc(int a) {
        printf("%d", a);
    }
    

    The assembly for this function would look like this:

    OP Params OpName     Description
    13 82 6a  PushString 82 means string, 6a is the address of "%d".
                         So this instruction pushes a pointer to "%d" onto the stack.
    13 83 00  PushInt    83 means integer, 00 means the one on the top of the stack.
                         So this instruction gets the integer at the top of the stack
                         and pushes it onto the stack again.
    17 13 88  Call       1388 is printf, so this calls the printf function.
    03 02     Pop        This pops the two things we pushed back off the stack.
    02        Return     This returns to the calling code.
    

    So when your JIT compiler reads in void myfunc(int a) {printf("%d", a);}, it allocates memory for this function (e.g. at address 0x75612d39) and stores these bytes in that memory: 13 82 6a 13 83 00 17 13 88 03 02 02. Then, to call that function, it simply jumps to (or calls) address 0x75612d39.