I'm familiar with r/m8, r/m16, imm16, etc but how do I encode m16:16, m16:32, and m16:64? These are in the JMP and CALL instructions...
Is m16:16 an address location? Or is it like an immediate address? Any help would be greatly appreciated!
"encode" would normally mean machine-code bytes. But I think you're asking about assembler syntax, because Intel's manuals are clear about the machine code. (See the entry for jmp
, or the rest of Intel's vol.2 instruction set reference manual for more about how an entry is formatted and what stuff means.)
jmp m16:64
is a memory-indirect far jump, with a new RIP and CS value (in that order because x86 is little-endian).
Just like a memory-indirect near jump, you simply supply an addressing mode, and the CPU loads the memory operand from there. But it's a 10-byte memory operand instead of 8 for a near jump.
You can use any addressing mode. I used [rdi] for simplicity. All of this would be the same with call far / lcall as well.
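To make the operand layout concrete, here is a minimal Python sketch that builds the in-memory image of each far-pointer form. It only models what sits in memory: the offset first, then the 16-bit selector, all little-endian. The helper name far_pointer_bytes and the sample selector/offset values are made up for illustration, not taken from Intel's manual.

```python
import struct

# struct code for the offset part of each far-pointer form
# (H = 16-bit, I = 32-bit, Q = 64-bit). The selector is always 16-bit.
_OFFSET_FORMAT = {16: "H", 32: "I", 64: "Q"}

def far_pointer_bytes(selector, offset, offset_bits):
    """Memory image of an m16:16 / m16:32 / m16:64 operand (hypothetical helper)."""
    fmt = "<" + _OFFSET_FORMAT[offset_bits] + "H"  # offset first, selector last
    return struct.pack(fmt, offset, selector)

if __name__ == "__main__":
    # Sample values only: selector 0x33, offset 0x401000.
    for bits in (16, 32, 64):
        offset = 0x1000 if bits == 16 else 0x401000
        image = far_pointer_bytes(0x33, offset, bits)
        print(f"m16:{bits}: {len(image)} bytes: {image.hex()}")
```

So m16:16 is a 4-byte memory operand, m16:32 is 6 bytes, and m16:64 is 10 bytes. The instruction itself only carries an addressing mode that points at those bytes.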
NASM syntax:
jmp far [rdi] ; for YASM, you need a manual REX prefix somehow
AT&T syntax:
rex64 ljmp *(%rdi) # force REX prefix which buggy GAS omits
ljmpq *(%rdi) # clang accepts this, GAS doesn't.
GAS .intel_syntax noprefix
disassembly for objdump -drwC -Mintel:
400080: 48 ff 2f rex.W jmp FWORD PTR [rdi]
Or from llvm-objdump -d into AT&T syntax:
400080: 48 ff 2f ljmpq *(%rdi)
GNU Binutils is wrong: it needs a 48 REX.W prefix to set the operand-size to 64-bit. (Of the memory source operand, I think.)
FWORD (48-bit far-word = m16:32) might actually be correct disassembly without a REX prefix, which is why that isn't what we want: it crashes without a REX.W if the pointed-to memory is actually an m16:64. We want 48 ff 2f for a TWORD (m16:64) memory operand.
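As a sanity check on where 48 ff 2f comes from, here is a small Python sketch that assembles the bytes by hand: an optional REX.W prefix, opcode FF, and a ModRM byte with mod=00, the /5 opcode extension in the reg field, and the base register in the r/m field. The function name and the register table are my own. It deliberately only handles base registers that need no SIB byte, displacement, or REX.B.

```python
REX_W = 0x48      # REX prefix with only the W bit set
OPCODE_FF = 0xFF  # opcode group 5: /5 is the indirect far jmp, /3 the indirect far call

# Base registers whose plain [reg] form needs only a ModRM byte with mod=00.
# Not handled: rsp/r12 (need a SIB byte), rbp/r13 (need a disp8), r8-r15 (need REX.B).
SIMPLE_BASE = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3, "rsi": 6, "rdi": 7}

def encode_far_indirect(base, reg_field=5, rex_w=True):
    """Machine code for jmp far [base] (reg_field=5) or call far [base] (reg_field=3)."""
    modrm = (0b00 << 6) | (reg_field << 3) | SIMPLE_BASE[base]
    prefix = bytes([REX_W]) if rex_w else b""
    return prefix + bytes([OPCODE_FF, modrm])

if __name__ == "__main__":
    print(encode_far_indirect("rdi").hex())               # with REX.W: m16:64
    print(encode_far_indirect("rdi", rex_w=False).hex())  # without: m16:32 in 64-bit mode
    print(encode_far_indirect("rdi", reg_field=3).hex())  # call far [rdi]
```

The only difference between the m16:32 and m16:64 forms is the REX.W prefix. That is why an assembler that silently omits it produces code that loads the wrong number of bytes.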
GAS won't assemble ljmpq *(%rdi)
, but clang will.
For example, to set CS=si and RIP=rdi:
; NASM syntax
mov [rsp], rdi
mov [rsp+8], si ; new CS value goes last because x86 is little-endian
jmp far [rsp] ; loads 10 bytes from memory
or push rsi / push rdi / jmp far [rsp], or any other memory location you want to use.
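If it's not obvious why the two-push version is equivalent, this Python sketch models the stack bytes for both sequences. It's just a model of memory, not of the CPU. The function names and the register values in the demo are made up for illustration.

```python
import struct

def image_from_movs(rdi, si):
    """mov [rsp], rdi ; mov [rsp+8], si  ->  the 10 bytes at [rsp]."""
    mem = bytearray(10)
    struct.pack_into("<Q", mem, 0, rdi)  # 8-byte new RIP at offset 0
    struct.pack_into("<H", mem, 8, si)   # 2-byte new CS at offset 8
    return bytes(mem)

def image_from_pushes(rdi, rsi):
    """push rsi ; push rdi  ->  the first 10 bytes at the new [rsp]."""
    # The last push ends up at the lowest address, so rdi sits at [rsp], rsi at [rsp+8].
    stack = struct.pack("<Q", rdi) + struct.pack("<Q", rsi)
    return stack[:10]  # the far jump reads only 10 bytes; the top 6 bytes of rsi are ignored

if __name__ == "__main__":
    rdi = 0x401000          # sample target address
    rsi = 0xDEADBEEF0033    # only the low 16 bits (si = 0x0033) matter
    print(image_from_movs(rdi, rsi & 0xFFFF).hex())
    print(image_from_pushes(rdi, rsi).hex())
```

Both print the same 10 bytes, because little-endian puts si in the first two bytes of the pushed rsi.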
NASM knows that a far jmp requires a REX.W prefix, unlike YASM and GNU Binutils. NASM's output includes the prefix:
; assembled by NASM (not YASM), disassembled with objdump -drwC -Mintel
400080: 48 ff 2f rex.W jmp FWORD PTR [rdi]
printf '\xff\x2f' | ndisasm -b64 - shows us NASM's disassembly output:
; ndisasm -b64 output thinks it's a dword (m16:16)?
00000000 FF2F jmp dword far [rdi]
Intel's manual entry lists jmp m16:64 as requiring a REX.W prefix, but GAS / binutils incorrectly thinks that's not necessary. See also the discussion at https://lkml.org/lkml/2012/12/23/164 about the Linux kernel's use of lret vs. rex64 ljmp *initial_code(%rip), and guesswork as to whether AMD CPUs support FF /5 with a REX.W prefix, since AMD documentation doesn't explicitly mention it.
I tested this in a static-pie executable on GNU/Linux (so it would be loaded outside the low 32 bits), on an Intel i7-6700k Skylake:
default rel
foo:
mov eax, 231
syscall ; exit_group(edi)
global _start
_start:
mov eax, cs
push rax ; push cs is gone in x86-64
lea rax, [foo]
push rax
call far [rsp]
$ nasm -felf64 farjmp.asm # or yasm
$ gcc -nostdlib -static-pie farjmp.o -o farjmp
$ ./farjmp
or gdb ./farjmp
call far [rsp]
.foo:
jmp far ptr16:64 doesn't exist, and ptr16:32 or ptr16:16 aren't usable in 64-bit mode. That would be a 10-byte immediate (direct) absolute jump target. x86-64 can't use absolute direct jumps at all: there's no way to encode a new CS or RIP into a jmp instruction.
Direct near jumps use a rel32 or rel8, and of course they can't change CS. (That's what near means.)
32-bit mode has jmp far ptr16:32 (with a 6-byte immediate).
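For comparison with the memory-indirect form, here is a sketch of what the direct 32-bit-mode encoding looks like: opcode EA followed by the 6-byte immediate (4-byte offset, then 2-byte selector). The function name and the sample selector/offset are hypothetical. Remember this opcode is not valid in 64-bit mode.

```python
import struct

def encode_jmp_ptr16_32(selector, offset):
    """32-bit-mode jmp far ptr16:32: opcode EA + 32-bit offset + 16-bit selector."""
    return b"\xEA" + struct.pack("<IH", offset, selector)

if __name__ == "__main__":
    # Sample values only: jump to selector 0x08, offset 0x00100000.
    code = encode_jmp_ptr16_32(0x08, 0x00100000)
    print(len(code), code.hex())
```

The immediate has the same offset-then-selector layout as an m16:32 operand in memory. The only difference is that here it's embedded in the instruction stream, which makes the whole instruction 7 bytes.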
There aren't many use-cases for jmp far, especially in 64-bit mode. In a kernel, you'd use iret or sysret to return to 32-bit user-space, and there's usually no other reason to switch code segments. I guess you could have your kernel switch to 32-bit mode within the kernel.