compiler-construction, jit, llvm, language-implementation

How would you re-use C opcode implementations when writing a JIT with LLVM?


In the LLVM tutorials and examples, the compiler outputs LLVM IR by making calls like this:

return Builder.CreateAdd(L, R, "addtmp");

but many interpreters are written like this:

switch (opcode) {
    case ADD:
        result = L + R;
        break;
    ...
}

How would you extract each of these code snippets to make a JIT with LLVM without having to re-implement each opcode in LLVM IR?


Solution

  • Okay, first take each of your code snippets and refactor it into its own function, so that your code becomes:

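    /* The opcode body, pulled out of the switch (uint32_t comes from <stdint.h>) */
    void addOpcode(uint32_t *result, uint32_t L, uint32_t R) {
        *result = L + R;
    }
    
    /* The interpreter loop now only dispatches */
    switch (opcode) {
        case ADD:
            addOpcode(&result, L, R);
            break;
        ....
    }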
    

    After this your interpreter should still run. Now move all the new functions into their own file and compile that file to LLVM bitcode with either llvm-gcc or clang. Then, instead of generating native code, run it through the "cpp" backend (llc -march=cpp, available in older LLVM releases and since removed). That generates C++ code which, when run, builds the LLVM IR for the compilation unit through LLVM API calls. You can pass options to limit the output to specific functions, etc.; you probably want -cppgen=module.

    Now, back in your interpreter loop, glue together calls to the functions created by the generated C++ code instead of directly executing the original code, then pass the resulting module to some optimizers and a native code generator. Gratz on the JIT ;-) You can see an example of this in a couple of LLVM projects, like the vm_ops in llvm-lua.
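
The shape of these steps can be sketched in plain C, without LLVM, to show what changes in the interpreter loop. This is only a stand-in under stated assumptions: the `MUL` opcode, the `Insn` bytecode layout, and the `translate` / `run_translated` helpers are hypothetical names invented for illustration, and recording function pointers takes the place of the step where a real JIT would emit LLVM call instructions.

```c
#include <stdint.h>

/* Step 1: every opcode body is its own function. In the real setup these
   would sit in a separate file that is compiled to LLVM bitcode. */
void addOpcode(uint32_t *result, uint32_t L, uint32_t R) { *result = L + R; }
void mulOpcode(uint32_t *result, uint32_t L, uint32_t R) { *result = L * R; } /* hypothetical */

enum { ADD, MUL }; /* MUL is a hypothetical second opcode */

/* Hypothetical bytecode format: acc = acc OP operand */
typedef struct {
    int opcode;
    uint32_t operand;
} Insn;

typedef void (*OpFn)(uint32_t *, uint32_t, uint32_t);

/* Plain interpreter: decodes and executes each instruction on every run. */
uint32_t interpret(const Insn *code, int n, uint32_t acc) {
    for (int i = 0; i < n; i++) {
        switch (code[i].opcode) {
            case ADD: addOpcode(&acc, acc, code[i].operand); break;
            case MUL: mulOpcode(&acc, acc, code[i].operand); break;
            default: break; /* unknown opcodes are ignored in this sketch */
        }
    }
    return acc;
}

/* Stand-in for the JIT step: the same loop and switch, but instead of
   executing an opcode it records which function should be called.
   A real JIT would emit an LLVM call instruction at this point.
   Returns n on success, -1 on an unknown opcode. */
int translate(const Insn *code, int n, OpFn *out) {
    for (int i = 0; i < n; i++) {
        switch (code[i].opcode) {
            case ADD: out[i] = addOpcode; break;
            case MUL: out[i] = mulOpcode; break;
            default: return -1;
        }
    }
    return n;
}

/* Runs the recorded calls: no opcode decoding happens here any more. */
uint32_t run_translated(const OpFn *calls, const Insn *code, int n, uint32_t acc) {
    for (int i = 0; i < n; i++) {
        calls[i](&acc, acc, code[i].operand);
    }
    return acc;
}
```

For the program `{ADD 5}, {MUL 3}, {ADD 1}` with a starting value of 2, both `interpret` and `run_translated` return 22. The difference is that the translated form decodes the bytecode only once. With LLVM the recorded calls would be IR, and since the opcode bodies are available as IR too, the optimizer can inline them before native code is generated.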