I understand that the `ret imm16` (`C2 imm16`) instruction with an operand of zero is no different than the operandless `ret` (`C3`) in its effect. However, when I explicitly give my assembler `ret 0`, should it not encode that as the `ret imm16` instruction since I explicitly provided the operand?
Suppose I assemble the following code with the version of ml.exe that ships with VS2019, using the command `ml file.asm /link /SUBSYSTEM:CONSOLE /ENTRY:stdMain`:
    .386
    .MODEL FLAT, STDCALL
    .CODE
    stdMain PROC
        xor eax, eax
        ret 0
    stdMain ENDP
    END
If I then open the executable with a disassembler, I see that the instruction encoded for `ret` was `C3`:
    00401000: 33 C0       xor eax,eax
    00401002: C3          ret
I can manually enforce the `C2` instruction by hard-coding the bytes for it:
    .386
    .MODEL FLAT, STDCALL
    .CODE
    stdMain PROC
        xor eax, eax
        db 0c2h, 0, 0 ; ret imm16=0
    stdMain ENDP
    END
Now I see the `C2` instruction in the disassembled output:
    00401000: 33 C0       xor eax,eax
    00401002: C2 00 00    ret 0
Is it correct for an assembler to 'optimize' like that?
You don't need 3 separate `db` lines; one `db` with 3 operands is equivalent:
    db 0c2h, 0, 0 ; ret imm16=0
> Is it correct for an assembler to 'optimize' like that?
In general, yes: it's accepted that assemblers can use the shortest encoding of an instruction, as long as it has exactly the same architectural effect and the same mnemonic.
For example, NASM will optimize `mov rax, 123` into `mov eax, 123`, even though some others (like YASM or GAS) don't by default. (GAS has a `-Os` option, which GCC doesn't pass to it by default.) Also, NASM will optimize `lea eax, [rax*2 + 123]` to `lea eax, [rax + rax + 123]` unless you use `[NOSPLIT 123 + rax*2]` to spend more code size on a disp32 for the benefit of avoiding a slower 3-component LEA.
NASM doesn't optimize `xor rax,rax` to `xor eax,eax`, though; I guess it doesn't check for zeroing idioms (both regs the same) with XOR.
NASM has a `-O0` option to not optimize, but that's very bad for code size: e.g. `mov rax, -1` is 10 bytes (imm64) instead of 7 (sign_extended_imm32), `add ecx, 123` uses an imm32 instead of a sign-extended imm8, and `jmp foo` uses rel32 instead of rel8 even if the label is nearby. (This used to be the default in old NASM versions. https://nasm.us/doc/nasmdoc2.html#section-2.1.24)
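To make the `add ecx, 123` case concrete, here's a toy Python model of the choice an assembler faces for that instruction. It's only a sketch under my own assumptions, not how NASM is actually implemented: `encode_add_ecx` and its `optimize` flag are names made up for illustration, with `optimize=False` loosely standing in for `-O0`.

```python
def encode_add_ecx(imm, optimize=True):
    """Toy encoder for `add ecx, imm` with 32-bit operand size (hypothetical helper)."""
    if not -2**31 <= imm < 2**32:
        raise ValueError("immediate does not fit in 32 bits")
    modrm = 0xC1  # mod=11, reg=/0 (ADD), rm=001 (ECX)
    # Interpret the value as a signed 32-bit number to see if imm8 can hold it.
    signed = imm - 2**32 if imm >= 2**31 else imm
    if optimize and -128 <= signed <= 127:
        # 83 /0 ib: the CPU sign-extends the 8-bit immediate to 32 bits.
        return bytes([0x83, modrm, signed & 0xFF])
    # 81 /0 id: full 32-bit immediate, little-endian.
    return bytes([0x81, modrm]) + (imm & 0xFFFFFFFF).to_bytes(4, "little")


print(encode_add_ecx(123).hex())                  # 83c17b
print(encode_add_ecx(123, optimize=False).hex())  # 81c17b000000
```

Both byte sequences do the same thing when executed; the short form just saves 3 bytes.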
MSVC always emits `ret` as `ret 0` in asm listings, so if you're ever assembling code like that, you definitely want the assembler to optimize it to a normal `ret`. Apparently this optimization is one that MS thinks it's normal to rely on.
It seems like a dumb design to ever write or emit `ret 0` when you want `ret`, but that's what MSVC does. (Not that MSVC works by feeding asm to MASM; it emits machine code directly unless you ask for an asm listing.)
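As a rough illustration of the two behaviours for `ret`, here's a toy Python model. It's a sketch under my own assumptions, not MASM's or NASM's real logic: `encode_ret` and its `optimize` flag are hypothetical names, where `optimize=True` mimics the MASM output seen in the question and `optimize=False` mimics an assembler that keeps the explicit operand.

```python
def encode_ret(imm16=None, optimize=True):
    """Toy encoder for near `ret` / `ret imm16` (hypothetical helper)."""
    if imm16 is None:
        return bytes([0xC3])  # plain ret
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("operand must fit in 16 bits")
    if optimize and imm16 == 0:
        # Popping 0 extra bytes has the same effect as plain ret.
        return bytes([0xC3])
    # C2 iw: 16-bit immediate, little-endian.
    return bytes([0xC2]) + imm16.to_bytes(2, "little")


print(encode_ret(0).hex())                  # c3
print(encode_ret(0, optimize=False).hex())  # c20000
print(encode_ret(8).hex())                  # c20800
```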
NASM does happen to assemble `ret 0` to `ret imm16=0`, so you might prefer using it. I know I'd pick NASM over MASM any time I had a choice: simple syntax, and free from magic rules about memory operands implying operand-sizes, and `[]` sometimes not meaning anything...