assembly x86-64 machine-code instruction-encoding

Encoding JMP FAR and CALL FAR in x86-64


I'm familiar with r/m8, r/m16, imm16, etc., but how do I encode m16:16, m16:32, and m16:64? These appear in the JMP and CALL instructions...

Is m16:16 an address location? Or is it like an immediate address? Any help would be greatly appreciated!


Solution

  • "encode" would normally mean machine-code bytes. But I think you're asking about assembler syntax, because Intel's manuals are clear about the machine code. (See the entry for jmp, or the rest of Intel's vol.2 instruction set reference manual for more about how an entry is formatted and what stuff means.)


    jmp m16:64 is a memory-indirect far jump, with a new RIP and CS value (in that order because x86 is little-endian).

    Just like a memory-indirect near jump, you simply supply an addressing mode, and the CPU loads the memory operand from there. But it's a 10-byte memory operand instead of 8 for a near jump.

    You can use any addressing mode. I used [rdi] for simplicity. All of this would be the same with call far / lcall as well.

    NASM syntax:

    jmp far [rdi]        ; for YASM, you need a manual REX prefix somehow
    

    AT&T syntax:

    rex64 ljmp *(%rdi)        # force REX prefix which buggy GAS omits
    ljmpq *(%rdi)             # clang accepts this, GAS doesn't.
    

    GAS .intel_syntax noprefix disassembly from objdump -drwC -Mintel:

      400080:       48 ff 2f        rex.W jmp FWORD PTR [rdi]
    

    Or from llvm-objdump -d into AT&T syntax:

      400080:  48 ff 2f             ljmpq   *(%rdi)
    

    GNU Binutils gets this wrong: the instruction needs a 48 REX.W prefix to set the operand-size to 64-bit. (Of the memory source operand, I think.)

    FWORD (48-bit far-word = m16:32) might actually be the correct disassembly for the encoding without a REX prefix. That form isn't what we want: without REX.W the CPU loads only a 6-byte operand, so it crashes if the pointed-to memory actually holds an m16:64. We want 48 ff 2f, which takes a TWORD (10-byte, m16:64) memory operand.
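To make the byte values above concrete, here is a minimal Python sketch that assembles the bytes for FF /5 (jmp far) and FF /3 (call far) with a plain [reg] addressing mode. The helper names `modrm` and `far_indirect` are made up for this illustration, and the sketch deliberately handles only the simple mod=00 case with no SIB byte or displacement.

```python
REX_W = 0x48      # REX prefix with only the W bit set
OPCODE_FF = 0xFF  # opcode byte shared by this group; ModRM.reg picks the operation
RDI = 7           # register number of rdi in the ModRM.rm field


def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the three ModRM fields into one byte."""
    return (mod << 6) | (reg << 3) | rm


def far_indirect(reg_ext: int, base: int = RDI, rex_w: bool = True) -> bytes:
    """Encode FF /reg_ext with a [base] operand (hypothetical helper).

    reg_ext is 5 for jmp far and 3 for call far. Bases 4 (rsp) and 5 (rbp)
    are rejected because mod=00 gives them a different meaning (SIB / RIP-relative).
    """
    if not 0 <= base <= 7 or base in (4, 5):
        raise ValueError("this sketch only handles simple [reg] operands")
    body = bytes([OPCODE_FF, modrm(0b00, reg_ext, base)])
    return (bytes([REX_W]) if rex_w else b"") + body


jmp_far_64 = far_indirect(5)                # jmp far [rdi] with REX.W
jmp_far_32 = far_indirect(5, rex_w=False)   # same opcode without REX.W
call_far_64 = far_indirect(3)               # call far [rdi] with REX.W

print(jmp_far_64.hex(" "))   # 48 ff 2f
print(jmp_far_32.hex(" "))   # ff 2f
print(call_far_64.hex(" "))  # 48 ff 1f
```

The only difference between the m16:64 and m16:32 forms is the leading 48 byte, which is why an assembler that drops it silently produces a different instruction.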

    GAS won't assemble ljmpq *(%rdi), but clang will.


    For example, to set CS=si and RIP=rdi

    ; NASM syntax
    mov   [rsp], rdi
    mov   [rsp+8], si     ; new CS value goes last because x86 is little-endian
    jmp far  [rsp]       ; loads 10 bytes from memory
    

    or push rsi / push rdi / jmp far [rsp], or any other memory location you want to use.
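The memory image those two stores build can be sketched in Python with `struct`. This is only an illustration of the layout: the function name `far_pointer_m16_64` and the example RIP value are invented here, and 0x33 is assumed as a typical 64-bit user-space CS selector on Linux.

```python
import struct


def far_pointer_m16_64(rip: int, cs: int) -> bytes:
    """Lay out an m16:64 operand: 8-byte offset first, then the 2-byte selector.

    '<QH' means little-endian, unsigned 64-bit then unsigned 16-bit, no padding.
    """
    return struct.pack("<QH", rip, cs)


# Example values are assumptions chosen for illustration only.
ptr = far_pointer_m16_64(0x0000555555555000, 0x33)

print(len(ptr))      # 10
print(ptr.hex(" "))  # 00 50 55 55 55 55 00 00 33 00
```

The offset occupies bytes 0 to 7 and the selector bytes 8 to 9, matching the stores to [rsp] and [rsp+8] in the NASM snippet.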


    NASM knows that a far jmp to an m16:64 requires a REX.W prefix, unlike YASM and GNU Binutils. It emits:

    ; assembled by NASM (not YASM), disassembled with objdump -drwC -Mintel
    400080:       48 ff 2f                rex.W jmp FWORD PTR [rdi]
    

    printf '\xff\x2f' | ndisasm -b64 - shows us NASM's disassembly output:

    ; ndisasm -b64 output thinks it's a dword (m16:16)?
    00000000  FF2F              jmp dword far [rdi]
    

    Intel's manual entry lists jmp m16:64 as requiring a REX.W prefix, but GAS / binutils incorrectly thinks that's not necessary. See also discussion on https://lkml.org/lkml/2012/12/23/164 about Linux kernel code usage of lret vs. rex64 ljmp *initial_code(%rip), and guesswork as to whether AMD CPUs support FF /5 with a REX.W prefix, since AMD documentation doesn't explicitly mention it.


    Experimental test: REX prefix needed for far jump/call

    I tested this in a static-pie executable on GNU/Linux (so it would be loaded outside the low 32 bits), on an Intel i7-6700k Skylake:

    default rel
    foo:
        mov  eax, 231
        syscall              ; exit_group(edi)
    
    global _start
    _start:
    
        mov  eax, cs
        push rax             ; push cs is gone in x86-64
        lea  rax, [foo]
        push rax
        call far [rsp]
    
    $ nasm -felf64 farjmp.asm          # or yasm
    $ gcc -nostdlib -static-pie farjmp.o  -o farjmp
    $ ./farjmp
    or  gdb ./farjmp
    
    • Assembled by YASM (with no REX.W), it segfaults on call far [rsp].
    • Assembled by NASM (with a REX.W), it successfully reaches foo.
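A plausible way to picture the failure without REX.W is to reinterpret the same 10 bytes as an m16:32 operand. The sketch below does that in Python; the address 0x00007FFFF7FFD000 is a made-up example of a load address above the low 32 bits, 0x33 is an assumed CS value, and the code only shows which bytes would be read, not how the CPU reacts to them.

```python
import struct


def as_m16_64(mem: bytes) -> tuple:
    """Read an 8-byte offset followed by a 2-byte selector (REX.W form)."""
    return struct.unpack_from("<QH", mem, 0)


def as_m16_32(mem: bytes) -> tuple:
    """Read a 4-byte offset followed by a 2-byte selector (no REX.W)."""
    return struct.unpack_from("<IH", mem, 0)


# What the test program pushes: a 64-bit address, then CS (values assumed).
mem = struct.pack("<QH", 0x00007FFFF7FFD000, 0x33)

off64, sel64 = as_m16_64(mem)
off32, sel32 = as_m16_32(mem)

print(hex(off64), hex(sel64))  # 0x7ffff7ffd000 0x33
print(hex(off32), hex(sel32))  # 0xf7ffd000 0x7fff
```

Read as m16:32, the selector comes from bits 32 to 47 of the pushed address rather than from the saved CS, so with a high load address the CPU would be handed a selector that was never intended as one.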

    jmp far ptr16:64 doesn't exist, and ptr16:32 or ptr16:16 aren't usable in 64-bit mode. That would be a 10-byte immediate (direct) absolute jump target. x86-64 can't use absolute direct jumps at all: there's no way to encode a new CS or RIP into a jmp instruction.

    Direct near jumps use a rel32 or rel8, and of course they can't change CS. (That's what near means).
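For contrast, the relative encoding of direct near jumps can be sketched the same way. The helpers `jmp_rel32` and `jmp_rel8` are hypothetical names, and the addresses are arbitrary examples; the displacement is measured from the end of the instruction (5 bytes for E9 rel32, 2 bytes for EB rel8).

```python
import struct


def jmp_rel32(insn_addr: int, target: int) -> bytes:
    """Encode jmp rel32: opcode E9 plus a signed 32-bit displacement."""
    rel = target - (insn_addr + 5)           # relative to the next instruction
    return b"\xE9" + struct.pack("<i", rel)  # struct.error if out of range


def jmp_rel8(insn_addr: int, target: int) -> bytes:
    """Encode jmp rel8: opcode EB plus a signed 8-bit displacement."""
    rel = target - (insn_addr + 2)
    return b"\xEB" + struct.pack("<b", rel)


near = jmp_rel32(0x400080, 0x400100)
loop = jmp_rel8(0x400080, 0x400080)   # a jump to itself

print(near.hex(" "))  # e9 7b 00 00 00
print(loop.hex(" "))  # eb fe
```

Neither form has room for a segment selector, which is the sense in which a near jump cannot change CS.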

    32-bit mode has jmp far ptr16:32 (with a 6-byte immediate).

    There aren't many use-cases for jmp far, especially in 64-bit mode. In a kernel, you'd use iret or sysret to return to 32-bit user-space, and there's usually no other reason to switch code segments. I guess you could have your kernel switch to 32-bit mode within the kernel.