I'm starting to use the Intel reference page to look up and learn about the opcodes (instead of asking everything on SO). I'd like to make sure that my understanding is OK, and ask a few questions about how the output of a basic asm program lines up with the Intel instruction codes.
Here is the program I use to compare various mov instructions into the rax-ish register (is there a better way to say "rax" and its 32-, 16-, and 8-bit components?):
.globl _start
_start:
movq $1, %rax # move immediate into 8-byte rax (rax)
movl $1, %eax # move immediate into 4-byte rax (eax)
movw $1, %ax # move immediate into 2-byte rax (ax)
movb $1, %al # move immediate into 1-byte rax (al)
mov $60, %eax
syscall
And it disassembles as follows:
$ objdump -D file
file: file format elf64-x86-64
Disassembly of section .text:
0000000000400078 <_start>:
400078: 48 c7 c0 01 00 00 00 mov $0x1,%rax
40007f: b8 01 00 00 00 mov $0x1,%eax
400084: 66 b8 01 00 mov $0x1,%ax
400088: b0 01 mov $0x1,%al
40008a: b8 3c 00 00 00 mov $0x3c,%eax
40008f: 0f 05 syscall
Now, matching these up to the Intel codes listed for MOV:
Here is how far I get reconciling the four instructions:
- mov $0x1,%al --> b0 01: b0 [+ 1 byte for value] for a 1-byte move immediate.
- mov $0x1,%eax --> b8 01 00 00 00: b8 [+ 4 bytes for value] for a 4-byte move immediate.
- mov $0x1,%ax --> 66 b8 01 00: the listed code is b8, not 66 b8.
- mov $0x1,%rax --> 48 c7 c0 01 00 00 00
From this, my questions are:
- Why doesn't mov $0x1,%ax match up?
- Where are the 64-bit codes, or what's the suggested way to look them up?
- Suppose I use %ebx or %r11 instead. How do you calculate the 'code adjustment'? It looks like this lookup table only gives (I think?) the eax
register for the 'register example codes'.

You're missing the (concept of) prefix "opcodes" that change the meaning of the following instruction. Volume 2, sections 2.1.1 and 2.2.1 of the IA32 manual cover this. From 2.1.1 we get:
Operand-size override prefix is encoded using 66H (66H is also used as a mandatory prefix for some instructions).
So the 66 prefix changes the operand size from the default 32-bit to 16-bit. Thus mov $1,%ax (16-bit) uses the same b8 opcode as mov $1,%eax (32-bit), just with the 66 prefix in front and a 2-byte immediate instead of a 4-byte one.
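As a rough illustration of the prefix rule, the following Python sketch rebuilds the three short encodings from the disassembly in the question. The function name `encode_mov_imm_to_a` is made up for this example, and it only handles the A register (al/ax/eax); it is not a general assembler.

```python
import struct

def encode_mov_imm_to_a(imm, size_bits):
    """Encode `mov $imm, %al/%ax/%eax` (hypothetical helper, A register only)."""
    if size_bits == 8:
        # B0+rb ib: 8-bit move, register number 0 (al), 1-byte immediate
        return bytes([0xB0]) + struct.pack("<B", imm)
    if size_bits == 32:
        # B8+rd id: default 32-bit operand size, 4-byte immediate
        return bytes([0xB8]) + struct.pack("<I", imm)
    if size_bits == 16:
        # Same B8 opcode, but the 66 prefix switches the operand size
        # to 16 bits, so the immediate shrinks to 2 bytes as well
        return bytes([0x66, 0xB8]) + struct.pack("<H", imm)
    raise ValueError("unsupported operand size")

for bits in (8, 16, 32):
    print(bits, encode_mov_imm_to_a(1, bits).hex(" "))
```

The 16-bit and 32-bit branches differ only in the leading 66 byte and the width of the packed immediate, which is exactly the difference between the `66 b8 01 00` and `b8 01 00 00 00` lines in the objdump output.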
The last case (mov $1, %rax) actually uses a different instruction:
REX.W + C7 /0 id    MOV r/m64, imm32    Move imm32 sign extended to 64-bits to r/m64.
The leading 48 is the REX.W prefix, which selects the 64-bit operand size. This form takes any register or memory destination, selected by the ModRM byte (c0 here selects rax), instead of encoding the register in the opcode byte itself. The instruction is one byte longer before the immediate, but it moves a sign-extended 32-bit immediate into a 64-bit register, so it needs only a 4-byte constant instead of an 8-byte one. It therefore ends up 3 bytes smaller than the equivalent 48 b8 01 00 00 00 00 00 00 00.
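To make the size comparison concrete, and to show how the register number enters the encoding, here is a hedged Python sketch of both 64-bit forms. The helper names and the `REGS` table are assumptions made for illustration; the register numbers (rax = 0, rbx = 3, r11 = 11) and the REX bit layout follow the Intel manual.

```python
import struct

# Register numbers; values above 7 need the extra REX.B bit
REGS = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3, "r11": 11}

def mov_imm32_to_r64(reg, imm):
    """REX.W + C7 /0 id: sign-extended 32-bit immediate into a 64-bit register."""
    n = REGS[reg]
    rex = 0x48 | (n >> 3)    # 0x48 = REX.W; REX.B (bit 0) carries bit 3 of the register number
    modrm = 0xC0 | (n & 7)   # mod=11 (register direct), reg field=0 for /0, rm=low 3 bits
    return bytes([rex, 0xC7, modrm]) + struct.pack("<i", imm)

def mov_imm64_to_r64(reg, imm):
    """REX.W + B8+rd io: full 64-bit immediate, register number added to the opcode."""
    n = REGS[reg]
    rex = 0x48 | (n >> 3)
    # "<Q" packs an unsigned 64-bit value, so this sketch assumes imm >= 0
    return bytes([rex, 0xB8 + (n & 7)]) + struct.pack("<Q", imm)

short = mov_imm32_to_r64("rax", 1)
long_ = mov_imm64_to_r64("rax", 1)
print(short.hex(" "))            # 7 bytes
print(long_.hex(" "))            # 10 bytes
print(len(long_) - len(short))   # size difference in bytes
```

Under these assumptions, `mov_imm32_to_r64("r11", 1)` yields 49 c7 c3 01 00 00 00: the low three bits of the register number go into the ModRM byte, and the fourth bit moves into the REX prefix, turning 48 into 49. The same split applies to the `B8+rd` forms, where the low three bits are added to the opcode byte instead.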