assembly x86-64 machine-code opcode instruction-encoding

Opcode differences between MOV r/m32, imm32 and MOV r32, imm32

These are two MOV instruction encodings from the Intel® 64 and IA-32 Architectures Software Developer's Manual:

Opcode      Instruction        Op/En  64-Bit Mode  Compat/Leg Mode  Description
B8+ rd id   MOV r32, imm32     OI     Valid        Valid            Move imm32 to r32.
C7 /0 id    MOV r/m32, imm32   MI     Valid        Valid            Move imm32 to r/m32.

I disassembled one instance of each:

0:  b8 44 33 22 11          mov    eax, 0x11223344

0:  67 c7 00 44 33 22 11    mov    DWORD PTR [eax], 0x11223344

My questions are:

Why is the C7 opcode register/memory (r/m32, imm32) instead of memory-only (m32, imm32)?
Is there any time we would use C7 for register-only (r32, imm32) instead of B8?

Solution

Why is opcode C7 r/m32, imm32 instead of memory-only m32, imm32?

Because it would probably take extra transistors to special-case it and raise #UD when ModRM.mod = 11 (register destination), instead of just running it like any other instruction with a write-only destination, such as mov r/m32, r32.
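To make the two encodings concrete, here is a minimal Python sketch that builds the bytes for both forms with a register destination. The function names are made up for illustration; the byte layouts follow the B8+rd id and C7 /0 id rows quoted in the question.

```python
import struct

def mov_r32_imm32_short(reg: int, imm: int) -> bytes:
    """B8+rd id: the register number is folded into the opcode byte."""
    return bytes([0xB8 + reg]) + struct.pack("<I", imm & 0xFFFFFFFF)

def mov_rm32_imm32_reg(reg: int, imm: int) -> bytes:
    """C7 /0 id with ModRM.mod = 11, so the r/m field names a register."""
    modrm = (0b11 << 6) | (0 << 3) | reg  # mod=11, reg field=/0, r/m=reg
    return bytes([0xC7, modrm]) + struct.pack("<I", imm & 0xFFFFFFFF)

EAX = 0  # register number of EAX
short_form = mov_r32_imm32_short(EAX, 0x11223344)
long_form = mov_rm32_imm32_reg(EAX, 0x11223344)
print(short_form.hex(" "))  # b8 44 33 22 11
print(long_form.hex(" "))   # c7 c0 44 33 22 11
```

Both byte sequences do the same thing; the C7 form just spends one extra byte on ModRM.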

Is there any time we would use C7 for mov r32, imm32 instead of B8?

In 32-bit mode, when this design choice was made, no. The one exception is deliberately using the longer encoding to align later code instead of inserting a separate nop; see "What methods can be used to efficiently extend instruction length on modern x86?"

In 64-bit mode, you would use C7 with a register destination, but only with a REX.W prefix. The shortest encoding for mov rax, -123 is the 7-byte REX.W C7 /0 form, mov r/m64, sign_extended_imm32. REX.W B8+rd is the 10-byte mov r64, imm64.
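As a rough illustration of that size difference, this Python sketch builds both 64-bit encodings of mov rax, -123. The helper names are hypothetical, and it only covers register numbers 0-7 (RAX..RDI), so the REX prefix is always plain REX.W (0x48).

```python
import struct

REX_W = 0x48  # REX prefix with only the W bit set (64-bit operand size)

def mov_r64_simm32(reg: int, imm: int) -> bytes:
    """REX.W C7 /0 id: the imm32 is sign-extended to 64 bits."""
    modrm = 0xC0 | reg  # mod=11, reg field=/0, r/m=reg
    # "<i" is a signed 32-bit pack; it raises struct.error if imm does not fit
    return bytes([REX_W, 0xC7, modrm]) + struct.pack("<i", imm)

def mov_r64_imm64(reg: int, imm: int) -> bytes:
    """REX.W B8+rd io: a full 64-bit immediate."""
    return bytes([REX_W, 0xB8 + reg]) + struct.pack("<Q", imm & 0xFFFFFFFFFFFFFFFF)

RAX = 0
seven = mov_r64_simm32(RAX, -123)
ten = mov_r64_imm64(RAX, -123)
print(len(seven), seven.hex(" "))  # 7 48 c7 c0 85 ff ff ff
print(len(ten), ten.hex(" "))      # 10 48 b8 85 ff ff ff ff ff ff ff
```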

Of course, for 64-bit values that fit in 32 bits when zero-extended, like mov rax, 0x0000000012345678, you should actually use the 5-byte mov eax, 0x12345678, because writing a 32-bit register implicitly zero-extends into the full 64-bit register. NASM will do this for you by default, and so will GAS with as -Os or gcc -Wa,-Os. Other assemblers won't, so it's up to the programmer to use 32-bit operand-size for non-huge non-negative 64-bit constants.
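The selection rule can be sketched as a small Python function that picks the shortest of the three encodings for a given constant. This is only an illustration of the decision, not how NASM or GAS implement it; shortest_mov_r64_imm is a hypothetical name, and the sketch assumes register numbers 0-7 so that no REX.B bit is needed.

```python
import struct

MASK64 = (1 << 64) - 1

def shortest_mov_r64_imm(reg: int, value: int) -> bytes:
    """Return the shortest encoding that loads `value` into a 64-bit register."""
    if not 0 <= reg <= 7:
        raise ValueError("sketch only covers register numbers 0-7")
    u = value & MASK64  # view the constant as an unsigned 64-bit pattern
    if u <= 0xFFFFFFFF:
        # B8+rd id (5 bytes): writing r32 zero-extends into r64
        return bytes([0xB8 + reg]) + struct.pack("<I", u)
    if u >= MASK64 - 0x7FFFFFFF:
        # Upper 33 bits all set: REX.W C7 /0 id (7 bytes), sign-extended imm32
        return bytes([0x48, 0xC7, 0xC0 | reg]) + struct.pack("<I", u & 0xFFFFFFFF)
    # Anything else needs REX.W B8+rd io (10 bytes)
    return bytes([0x48, 0xB8 + reg]) + struct.pack("<Q", u)

RAX = 0
for constant in (0x12345678, -123, 0x1122334455667788):
    code = shortest_mov_r64_imm(RAX, constant)
    print(len(code), code.hex(" "))
```

For the three sample constants this picks the 5-byte, 7-byte, and 10-byte forms respectively.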

See examples and more details in

BTW, it's unusual to use [eax] as the destination in 64-bit mode. A 64-bit address size like [rax] has more compact machine code, and normally your addresses are properly extended to 64-bit registers even if you were packing them into narrower storage. That's why your disassembly has an extra 67 byte: it is the address-size override prefix.
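To show where the 67 byte comes from, here is a small Python sketch of the memory-destination form in 64-bit mode. The function name and the addr32 flag are made up for illustration. It only handles a bare base register with no displacement, and rejects r/m values 4 and 5 because with mod = 00 those select a SIB byte and a disp32 form rather than a plain base register.

```python
import struct

def mov_dword_mem_imm32(base_reg: int, imm: int, addr32: bool = False) -> bytes:
    """C7 /0 id with ModRM.mod = 00: destination is memory at [base_reg]."""
    if base_reg not in (0, 1, 2, 3, 6, 7):
        raise ValueError("sketch only covers plain [reg] addressing")
    modrm = (0b00 << 6) | (0 << 3) | base_reg  # mod=00, reg field=/0, r/m=base
    prefix = b"\x67" if addr32 else b""        # 67 = address-size override
    return prefix + bytes([0xC7, modrm]) + struct.pack("<I", imm & 0xFFFFFFFF)

RAX = 0
with_rax = mov_dword_mem_imm32(RAX, 0x11223344)               # mov dword [rax], imm32
with_eax = mov_dword_mem_imm32(RAX, 0x11223344, addr32=True)  # mov dword [eax], imm32
print(with_rax.hex(" "))  # c7 00 44 33 22 11
print(with_eax.hex(" "))  # 67 c7 00 44 33 22 11
```

The second line matches the bytes in the question's disassembly.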