assembly compiler-construction x86-64 jit machine-code

Jumps for a JIT (x86_64)


I'm writing a JIT compiler in C for x86_64 Linux.

Currently the idea is to generate some bytecode in a buffer of executable memory (e.g. obtained with an mmap call) and jump to it with a function pointer.
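For reference, the setup described above might look roughly like the following minimal sketch. The helper names `make_exec_block` and `run_block_demo` are hypothetical, and it assumes the system allows a mapping that is writable and executable at the same time (hardened configurations may require mapping read/write first and switching to read/execute with `mprotect`).

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: copy `len` bytes of machine code into a fresh
 * executable mapping. Returns the mapping, or NULL on failure. */
void *make_exec_block(const unsigned char *code, size_t len)
{
    /* Assumption: a simultaneously writable and executable mapping is allowed. */
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;
    memcpy(mem, code, len);
    return mem;
}

/* Builds a block containing `mov eax, 42; ret` and calls it through a
 * function pointer. Returns the block's result, or -1 if mapping failed. */
int run_block_demo(void)
{
    static const unsigned char code[] = {
        0xB8, 0x2A, 0x00, 0x00, 0x00,   /* mov eax, 42 */
        0xC3                            /* ret         */
    };
    void *block = make_exec_block(code, sizeof code);
    if (block == NULL)
        return -1;

    int (*fn)(void) = (int (*)(void))block;
    int result = fn();

    munmap(block, sizeof code);
    return result;
}
```

The generated code here only touches `eax` and returns, so it behaves like an ordinary C function that takes no arguments and returns an `int`.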

I'd like to be able to link multiple blocks of executable memory together such that they can jump between each other using only native instructions.

Ideally, the C-level pointer to an executable block can be written into another block as an absolute jump address, something like this:

unsigned char code_1[] = { 0xAB, 0xCD, ... };
void *exec_block_1 = mmap(code_1, ... );
write_bytecode(code_1, exec_block_1);
...
unsigned char code_2[] = { 0xAB, 0xCD, ... , exec_block_1, ... };
void *exec_block_2 = mmap(code_2, ... );
write_bytecode(code_2, exec_block_2); // bytecode contains exec_block_1 as a jump
                                      // address so that the code in the second block
                                      // can jump to the code in the first block

However, I'm finding the limitations of x86_64 quite an obstacle here. As far as I can tell, there's no way to jump to an absolute 64-bit address in x86_64, as all available 64-bit jump operations are relative to the instruction pointer. This means that I can't use the C pointer as a jump target for generated code.

Is there a solution to this problem that will allow me to link blocks together in the manner I've described? Perhaps an x86_64 instruction that I'm not aware of?


Solution

  • Hmm, I'm not sure I've understood your question correctly, or whether this is a proper answer. It's quite a convoluted way to achieve this:

        ;instr              ; opcodes [op size] (comment)
        call next           ; e8 00 00 00 00 [5] (call to get current location)
    next:
        pop rax             ; 58 [1]  (next label address in rax)
        add rax, 12h        ; 48 83 c0 12 [4] (adjust rax to fall on landing label)
        push rax            ; 50 [1]  (push adjusted value)
        mov rax, code_block ; 48 b8 XX XX XX XX XX XX XX XX [10] (load target address)
        push rax            ; 50 [1] (push to ret to code_block)
        ret                 ; c3 [1] (go to code_block)
    landing:    
        nop
        nop
    

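    The e8 00 00 00 00 call is only there to put the current address on top of the stack. The code then adjusts rax by 12h (the 18 bytes from next to landing) so that it points at the landing label, and pushes it as the return address. You'll need to replace the XX bytes (in mov rax, code_block) with the virtual address of the code block. The ret instruction is used as a call: it pops that address and jumps to it. When the called block returns, execution should resume at landing.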

    Is this the kind of thing you're trying to achieve?
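To tie this back to the C side of the JIT, here is a rough sketch of how the sequence above could be emitted into a buffer with the target pointer patched in. The names `emit_far_call` and `run_link_demo` are hypothetical, the two blocks are mapped read/write/execute in one step (an assumption that may not hold on hardened systems), and the called block is assumed to end with a plain `ret`.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: writes the call/pop/add/push/mov/push/ret sequence
 * at `out` with `target` as the 64-bit immediate. Returns the number of
 * bytes written; the `landing` label is the byte right after them. */
size_t emit_far_call(unsigned char *out, const void *target)
{
    static const unsigned char tmpl[] = {
        0xE8, 0x00, 0x00, 0x00, 0x00,        /* call next                 */
        0x58,                                /* next: pop rax             */
        0x48, 0x83, 0xC0, 0x12,              /* add rax, 12h              */
        0x50,                                /* push rax (= landing)      */
        0x48, 0xB8, 0, 0, 0, 0, 0, 0, 0, 0,  /* mov rax, imm64 (patched)  */
        0x50,                                /* push rax (= target)       */
        0xC3                                 /* ret, i.e. jump to target  */
    };
    uint64_t addr = (uint64_t)(uintptr_t)target;

    /* `next` is at offset 5, so landing must be 12h bytes past it. */
    assert(sizeof tmpl - 5 == 0x12);

    memcpy(out, tmpl, sizeof tmpl);
    memcpy(out + 13, &addr, sizeof addr);    /* imm64 starts at offset 13 */
    return sizeof tmpl;
}

/* Maps two separate blocks, links the second to the first, and runs it. */
int run_link_demo(void)
{
    const int prot = PROT_READ | PROT_WRITE | PROT_EXEC;
    const int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    unsigned char *block_1 = mmap(NULL, 4096, prot, flags, -1, 0);
    unsigned char *block_2 = mmap(NULL, 4096, prot, flags, -1, 0);
    if (block_1 == MAP_FAILED || block_2 == MAP_FAILED)
        return -1;

    /* First block: mov eax, 42; ret */
    static const unsigned char callee[] = { 0xB8, 0x2A, 0, 0, 0, 0xC3 };
    memcpy(block_1, callee, sizeof callee);

    /* Second block: transfer to block_1, then return from landing. */
    size_t n = emit_far_call(block_2, block_1);
    block_2[n] = 0xC3;                       /* landing: ret to C caller  */

    int (*fn)(void) = (int (*)(void))(void *)block_2;
    int result = fn();

    munmap(block_1, 4096);
    munmap(block_2, 4096);
    return result;
}
```

Because the target is a full 64-bit immediate, the two mappings can be arbitrarily far apart in the address space. Note that the sequence overwrites rax, so the generated code must not be holding a live value there.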