
How does AsmJit handle code relocation when using AsmParser(&a).parse?


I use AsmJit in my C++ code and have defined the following function:

    // parse asm_str to byte code, return the length of byte code
    int assemble(bool isx64, unsigned long long addr, const char* asm_str, char buffer[MAX_INSTRUCTION_LENGTH])
    {
        // for test, I modified param's value
        isx64 = true;
        addr = 0x6a9ec0;
        asm_str = "call 0x00007FFF1CF8CEE0";

        auto arch = isx64 ? Arch::kX64 : Arch::kX86;

        // Initialize Environment with the requested architecture.
        Environment environment;
        environment.setArch(arch);

        // Initialize CodeHolder.
        CodeHolder code;
        Error err = code.init(environment, addr);

        if (err) {
            dbg_print_err("code.init failed, reason:%s", DebugUtils::errorAsString(err));
            return 0;
        }
        x86::Assembler a(&code);
        err = AsmParser(&a).parse(asm_str, strlen(asm_str));

        if (err) {
            dbg_print_err("AsmParser(&a).parse failed, asm_str=\"%s\" addr=0x%llx reason:%s", asm_str, addr, DebugUtils::errorAsString(err));
            return 0;
        }
        else {
            CodeBuffer& buf = code.sectionById(0)->buffer();
            memcpy(buffer, buf.data(), buf.size());
            print_byte_hex(buffer, buf.size());
            return (int)buf.size();
        }
    }

When I run this function, the buffer contains 40 E8 00 00 00 00 and no error is reported. However, I know that this instruction cannot be encoded directly at address 0x6a9ec0. How can I determine in code whether such an instruction was assembled correctly?

I have some further questions of this kind.

Following @Petr's explanation, I generated the following machine code:

FF 15 02 00 00 00 CC CC E0 CE F8 1C FF 7F 00 00

The result of disassembling this bytecode at this address using X64dbg is as follows:

00006a9ec0| FF15 02000000    | call qword ptr ds:[0x6a9ec8]
00006a9ec6| cc               | int 3
00006a9ec7| cc               | int 3
00006a9ec8| E0CEF81CFF7F0000 | dq 7FFF1CF8CEE0

However, this still does not solve the problem: after the call returns, execution continues at address 0x6a9ec6, so the bytes stored there are executed as instructions, which clearly breaks the logic of the program.

After some research, I found that emitting the bytes directly gives the correct behavior. The approach is to replace the call instruction with the following sequence:

7FFB694C0960   | FF15 04000000          | call qword ptr ds:[7FFB694C096a]
7FFB694C0966   | EB 0a                  | jmp 7FFB694C0972
7FFB694C0968   | 48 a1 0102030405060708 | mov rax, storage address for calls
7FFB694C0972   | 90                     | nop

As a beginner, I still have many similar problems to solve. They all arise when a 64-bit instruction is moved to another address, for example:

lea rax, ds:[rip+0x9DCAA]
mov rax,ds:[rip+0x100]
Near jump within a function
Far jump
Loop instruction
...

This has made learning assembly difficult for me. My current approach is to replace each of these instructions with an equivalent sequence that preserves the original logic.

For example:

0x00007fff1f618d5f cmp byte ptr ds:[rip+0x1637C6], 0x0

will be replaced with:

push rax
mov rax,0x7fff1f77c52c
cmp byte ptr ds:[rax], 0x0
pop rax

Note: here, rip + 0x1637C6 = 0x7fff1f77c52c.

I don't know whether this approach has any side effects. Is there a better solution using AsmJit?
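
The arithmetic in this note can be checked with a small sketch. It assumes the cmp instruction is 7 bytes long (opcode, ModRM, 4-byte displacement, 1-byte immediate), because a RIP-relative operand is resolved from the address of the next instruction. The helper name `ripRelativeTarget` is hypothetical.

```cpp
#include <cstdint>

// Absolute address referenced by a RIP-relative operand.
// The displacement is measured from the end of the instruction,
// so the instruction length is added to its address first.
constexpr std::uint64_t ripRelativeTarget(std::uint64_t instrAddr,
                                          std::uint64_t instrLen,
                                          std::int32_t disp) {
  // Sign-extend the 32-bit displacement, then add with unsigned wraparound.
  return instrAddr + instrLen +
         static_cast<std::uint64_t>(static_cast<std::int64_t>(disp));
}

// Assumption: the cmp encodes as 80 3D <disp32> <imm8>, i.e. 7 bytes.
static_assert(ripRelativeTarget(0x00007fff1f618d5fULL, 7, 0x1637C6) ==
                  0x00007fff1f77c52cULL,
              "matches the address quoted in the note");
```

The same function gives the address that a replacement sequence has to load into a register when the instruction is moved elsewhere.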


Solution

  • The data stored in AsmJit's CodeHolder is not the final machine code until you call code.flatten() and code.relocateTo(...). Until then it is machine code without relocations applied, which is why you see the initial 40 E8 .. .. .. .. sequence. AsmJit adds the REX prefix (40) so that the call can later be rewritten into something like call [rip + offset].

    To be able to do that, AsmJit inserts a new section called .addrtab. This section contains the absolute addresses of jump and call targets that are outside the 32-bit displacement range. It is essentially a convenience feature that lets users use these two instructions comfortably without having to check whether the target is reachable, which can be tricky if you use JIT allocators that need to know the size of the code you allocate.
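
Whether a target is reachable by a plain relative call can also be checked by hand. The following is a minimal sketch, not AsmJit's own logic. The helper names are hypothetical, and it assumes the 5-byte `E8 rel32` encoding, whose displacement is relative to the end of the instruction.

```cpp
#include <cstdint>
#include <limits>

// Length in bytes of `call rel32` (opcode E8 followed by a 4-byte displacement).
constexpr std::uint64_t kCallRel32Size = 5;

// Signed distance from the end of a `call rel32` at `instrAddr` to `target`.
// The subtraction is done on unsigned values so that it wraps instead of overflowing.
constexpr std::int64_t callDisplacement(std::uint64_t instrAddr,
                                        std::uint64_t target) {
  return static_cast<std::int64_t>(target - (instrAddr + kCallRel32Size));
}

// True if the displacement fits into the signed 32-bit field of `call rel32`.
constexpr bool fitsInRel32(std::uint64_t instrAddr, std::uint64_t target) {
  return callDisplacement(instrAddr, target) >=
             std::numeric_limits<std::int32_t>::min() &&
         callDisplacement(instrAddr, target) <=
             std::numeric_limits<std::int32_t>::max();
}

// Values from the question: the target is far outside the rel32 range.
static_assert(!fitsInRel32(0x6a9ec0ULL, 0x00007FFF1CF8CEE0ULL),
              "call target is not reachable with rel32");
```

With the addresses from the question the check fails, which is the situation the address table is meant to cover.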

    You can access the relocation entries via the code.relocEntries() getter, or call code.hasRelocEntries() to quickly check whether there are any.

    Here is a bit more detail about how it works in this case. When call 0x00007FFF1CF8CEE0 is encountered, AsmJit writes the machine code 40 E8 00 00 00 00 into the buffer (the zero bytes are only a placeholder at this point) and adds a RelocEntry to the CodeHolder. It also calls CodeHolder::addAddressToAddressTable(), which inserts the absolute address of the call target into the address table. The .addrtab section is created lazily, so it does not exist unless it is required.

    This means that the CodeHolder now has two sections, .text and .addrtab. At the end of machine code generation these sections have to be flattened, either by AsmJit's CodeHolder::flatten() or by the user if more complex logic is required.

    After the code has been flattened, each section has its own offset, so the sections can be viewed as spans in a single buffer that holds them all. This finally makes it possible to call CodeHolder::relocateTo(), which iterates over all RelocEntry records and applies them.

    In our case, the .text buffer contains the bytes 40 E8 00 00 00 00, so its size is 6 bytes. The .addrtab section is aligned to 8 bytes and would therefore be placed at offset 8, and the call instruction would be patched to something like call [rip + 2], which points to the first record of .addrtab.

    So the final buffer containing flattened code could look like:

    • FF 15 02 00 00 00 (call [rip + 2])
    • .. .. (2 dummy bytes)
    • E0 CE F8 1C FF 7F 00 00 (absolute address the call points to).
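
The layout above can be reproduced by hand. This sketch does not use AsmJit and only mirrors the arithmetic described here. `FlatCode`, `alignUp`, `byteAt`, and `flattenCall` are hypothetical names, and the value of the two padding bytes is passed in by the caller (the dump in the question shows CC).

```cpp
#include <cstdint>

// 6 bytes of .text, 2 padding bytes, 8 bytes of .addrtab.
struct FlatCode { std::uint8_t bytes[16]; };

// Round `value` up to a multiple of `alignment` (must be a power of two).
constexpr std::uint64_t alignUp(std::uint64_t value, std::uint64_t alignment) {
  return (value + alignment - 1) & ~(alignment - 1);
}

constexpr std::uint64_t kTextSize = 6;                           // call [rip+disp32]
constexpr std::uint64_t kAddrTabOffset = alignUp(kTextSize, 8);  // 8
// RIP points at the end of the call (offset 6), so the displacement is 8 - 6.
constexpr std::uint32_t kDisp =
    static_cast<std::uint32_t>(kAddrTabOffset - kTextSize);

// i-th byte of `v`, least significant first (little-endian order).
constexpr std::uint8_t byteAt(std::uint64_t v, unsigned i) {
  return static_cast<std::uint8_t>(v >> (8 * i));
}

constexpr FlatCode flattenCall(std::uint64_t target, std::uint8_t pad) {
  return FlatCode{{
      0xFF, 0x15,                                       // call [rip+disp32]
      byteAt(kDisp, 0), byteAt(kDisp, 1), byteAt(kDisp, 2), byteAt(kDisp, 3),
      pad, pad,                                         // alignment padding
      byteAt(target, 0), byteAt(target, 1), byteAt(target, 2), byteAt(target, 3),
      byteAt(target, 4), byteAt(target, 5), byteAt(target, 6), byteAt(target, 7)}};
}

static_assert(kAddrTabOffset == 8 && kDisp == 2, "call [rip + 2]");
```

Calling `flattenCall(0x00007FFF1CF8CEE0, 0xCC)` yields the 16 bytes listed in the question's dump.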

    Answer update:

    If you know that the call address is out of range and you don't want to use an extra register to hold the address, you can use AsmJit to encode a sequence that works without relying on .addrtab. Something like this should work out of the box:

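    #include <asmjit/x86.h>

    using namespace asmjit;
    
    // Emits an absolute call without an extra register or .addrtab:
    // the 8-byte target is embedded in the code stream and jumped over.
    void encodeCall(x86::Assembler& assembler, uint64_t address) {
      Label a = assembler.newLabel();  // Marks the embedded address.
      Label b = assembler.newLabel();  // Marks the code that follows it.
      assembler.call(x86::ptr(a)); // Would be encoded as [rip+offset].
      assembler.jmp(b);            // Skip the embedded address after the call returns.
    
      assembler.bind(a);
      assembler.dq(address);       // 8-byte absolute call target.
    
      assembler.bind(b);
    }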
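
For reference, the same sequence can be laid out by hand. This is only a sketch of one possible encoding, assuming a 6-byte `call [rip+disp32]` followed by a 2-byte short `jmp`. AsmJit may choose a different jump encoding for the forward label, so the displacements it emits can differ. `InlineCall`, `byteAt`, and `encodeInlineCall` are hypothetical names.

```cpp
#include <cstdint>

struct InlineCall { std::uint8_t bytes[16]; };

// i-th byte of `v`, least significant first (little-endian order).
constexpr std::uint8_t byteAt(std::uint64_t v, unsigned i) {
  return static_cast<std::uint8_t>(v >> (8 * i));
}

// Layout (offsets relative to the start of the sequence):
//   0: FF 15 02 00 00 00   call [rip+2]  reads the literal at offset 8
//   6: EB 08               jmp  +8       continues at offset 16
//   8: <8-byte address>    data only, never executed
constexpr InlineCall encodeInlineCall(std::uint64_t target) {
  return InlineCall{{
      0xFF, 0x15, 0x02, 0x00, 0x00, 0x00,
      0xEB, 0x08,
      byteAt(target, 0), byteAt(target, 1), byteAt(target, 2), byteAt(target, 3),
      byteAt(target, 4), byteAt(target, 5), byteAt(target, 6), byteAt(target, 7)}};
}
```

Because the jump skips the literal, execution resumes at offset 16 once the call returns, so the embedded address is never decoded as an instruction.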
    

    Otherwise, my recommendation is to always have the address in a register. It is the simplest approach and works best:

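    #include <asmjit/x86.h>

    using namespace asmjit;
    
    // tmpReg is overwritten, so it must be a 64-bit register that is free here.
    void encodeCall(x86::Assembler& assembler, x86::Gp tmpReg, uint64_t address) {
      assembler.mov(tmpReg, address); // Load the absolute 64-bit target.
      assembler.call(tmpReg);         // Indirect call through the register.
    }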