In the LLVM tutorials and examples, the compiler outputs LLVM IR by making calls like this:
return Builder.CreateAdd(L, R, "addtmp");
but many interpreters are written like this:
switch (opcode) {
case ADD:
    result = L + R;
    break;
...
How would you extract each of these code snippets to make a JIT with LLVM without having to re-implement each opcode in LLVM IR?
Okay, first refactor each of your code snippets into its own function. So your code becomes:
void addOpcode(uint32_t *result, uint32_t L, uint32_t R) {
    *result = L + R;
}

switch (opcode) {
case ADD:
    addOpcode(&result, L, R);
    break;
....
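For reference, here is a minimal, self-contained sketch of what the refactored interpreter might look like end to end. The `Opcode` enum, the `Instr` struct, `subOpcode`, and the accumulator-style `interpret` loop are all hypothetical and only there to make the example complete. Marking the opcode functions `extern "C"` is my own addition, not something the refactoring requires; it keeps the symbol names unmangled so they are easier to look up by name later.

```cpp
#include <cstdint>
#include <vector>

// One function per opcode. extern "C" (an assumption, not a requirement)
// keeps the names unmangled so they are easy to find by name later.
extern "C" void addOpcode(uint32_t *result, uint32_t L, uint32_t R) {
    *result = L + R;
}

extern "C" void subOpcode(uint32_t *result, uint32_t L, uint32_t R) {
    *result = L - R;
}

// Hypothetical opcode set and instruction layout, for illustration only.
enum Opcode : uint8_t { ADD, SUB };

struct Instr {
    Opcode opcode;
    uint32_t operand;
};

// Hypothetical interpreter loop: each instruction combines the accumulator
// (as L) with the instruction's operand (as R) by calling the opcode function.
uint32_t interpret(const std::vector<Instr> &program, uint32_t acc) {
    for (const Instr &in : program) {
        uint32_t result = 0;
        switch (in.opcode) {
        case ADD:
            addOpcode(&result, acc, in.operand);
            break;
        case SUB:
            subOpcode(&result, acc, in.operand);
            break;
        }
        acc = result;
    }
    return acc;
}
```

The switch now contains nothing but calls, so the opcode bodies can be moved to a separate file without touching the loop.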
Okay, so after doing this your interpreter should still run. Now take all the new functions and place them in their own file. Compile that file to LLVM bitcode using either llvm-gcc or clang, and then, instead of generating native code, run it through the "cpp" backend (llc -march=cpp). That will generate C++ code that instantiates the bitcode for the compilation unit. You can specify options to limit it to specific functions, etc. You probably want to use "-cppgen=module".
Now, back in your interpreter loop, glue together calls to the generated C++ code instead of directly executing the original code, then pass the result to some optimizers and a native code generator. Gratz on the JIT ;-) You can see an example of this in a couple of LLVM projects, like the vm_ops in llvm-lua.
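To make the "glue together calls" step more concrete, here is a rough sketch of its shape in plain C++. It is not LLVM code: instead of emitting IR, the `translate` function records a list of function pointers, which stands in for the sequence of call instructions a real JIT would emit (for example with `Builder.CreateCall`) into a new LLVM function. The `Opcode`, `Instr`, `Call`, `translate`, and `run` names are hypothetical, as is the accumulator-style calling convention.

```cpp
#include <cstdint>
#include <vector>

// The extracted opcode functions, as in the refactoring step.
void addOpcode(uint32_t *result, uint32_t L, uint32_t R) { *result = L + R; }
void subOpcode(uint32_t *result, uint32_t L, uint32_t R) { *result = L - R; }

// Hypothetical opcode set and instruction layout, for illustration only.
enum Opcode : uint8_t { ADD, SUB };

struct Instr {
    Opcode opcode;
    uint32_t operand;
};

using OpFn = void (*)(uint32_t *, uint32_t, uint32_t);

// One recorded call: which opcode function to invoke and with what operand.
// In a real LLVM JIT this would be an emitted call instruction instead.
struct Call {
    OpFn fn;
    uint32_t operand;
};

// Translation pass: opcode dispatch happens once here, not on every run.
std::vector<Call> translate(const std::vector<Instr> &program) {
    std::vector<Call> calls;
    calls.reserve(program.size());
    for (const Instr &in : program) {
        switch (in.opcode) {
        case ADD:
            calls.push_back({addOpcode, in.operand});
            break;
        case SUB:
            calls.push_back({subOpcode, in.operand});
            break;
        }
    }
    return calls;
}

// Executes the translated call sequence with no switch on the opcode.
uint32_t run(const std::vector<Call> &calls, uint32_t acc) {
    for (const Call &c : calls) {
        uint32_t result = 0;
        c.fn(&result, acc, c.operand);
        acc = result;
    }
    return acc;
}
```

The difference with LLVM is that the emitted calls and the opcode bodies live in the same module, so the optimizers can inline the bodies into the call sequence before native code is generated, which a table of function pointers cannot do.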