About the byte pattern 622c24, there are two cases.
The first case: the objdump / as pair.
objdump disassembles 622c24 to: bound %ebp,(%esp)
as assembles bound %ebp,(%esp) to: 622c24
The second case: the Capstone / Keystone library pair.
Capstone disassembles 622c24 to: bound (%esp), %ebp
Keystone assembles bound (%esp), %ebp to: 622c24
As you can see above, the order of the two operands is reversed between the two pairs:
bound %ebp,(%esp)
bound (%esp), %ebp
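The Intel manual lists the instruction as BOUND r32, m32&32, and AT&T syntax normally reverses Intel's operand order, so the memory operand should come first.
Therefore, the Capstone / Keystone pair would seem to be the correct one.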
Q. So, does the objdump / as pair have a problem disassembling the bound instruction?
Is it a bug in binutils?
Yes, this is probably a design bug in AT&T syntax. AT&T syntax normally follows the pattern of reversing the operands from Intel syntax and renaming the sign/zero-extension mnemonics (cdq => cltd, movsx eax, byte [mem] => movsbl). Deviations from that can be considered design bugs.
But they are not implementation bugs unless older versions behaved differently. It's valid (but unpleasant) for AT&T syntax to do whatever it wants and make up its own rules for different instructions. This might be another case of compatibility with the original UnixWare assembler (see below).
The bound instruction doesn't write either of its input operands, so neither one is really a destination. And unlike cmp, operand order doesn't have any meaning. It just checks the register against both the lower and upper bounds, and raises a #BR exception if it's out of bounds.
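As a rough illustration of that check, here is a minimal Python model of the 32-bit form. The names bound32 and BoundRangeExceeded are made up for this sketch; it assumes the memory operand is two consecutive signed little-endian dwords (lower bound first, then upper bound), as the Intel manual describes, and it uses a Python exception to stand in for #BR.

```python
import struct


class BoundRangeExceeded(Exception):
    """Stands in for the #BR exception in this model (hypothetical name)."""


def bound32(index: int, mem: bytes) -> None:
    """Model of the 32-bit BOUND check.

    index: signed value of the register operand.
    mem:   at least 8 bytes: signed lower bound, then signed upper bound.
    Nothing is written back; the only possible effect is the exception.
    """
    lower, upper = struct.unpack("<ii", mem[:8])
    if index < lower or index > upper:
        raise BoundRangeExceeded(f"{index} not in [{lower}, {upper}]")


bounds = struct.pack("<ii", 0, 9)   # an array with valid indices 0..9
bound32(5, bounds)                  # in range: no effect at all
try:
    bound32(10, bounds)             # out of range
except BoundRangeExceeded as exc:
    print("#BR:", exc)
```

Both inputs are only read, which is why calling either one the "destination" is arbitrary.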
There's only one opcode for it, and it requires register + memory operands (in the ModR/M reg and r/m fields, respectively).
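To make the encoding concrete, here is a small Python sketch that picks apart the three bytes 62 2c 24 by hand. decode_bound and the two render helpers are hypothetical names for this example, and the decoder only handles this one addressing form (mod=00 with a SIB byte and a plain base register). The point is that the bytes fix which operand is the register and which is memory; the textual order is purely a choice of the tool.

```python
# Register numbering shared by the ModR/M reg field and the SIB base field.
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_bound(code: bytes) -> dict:
    """Decode 62 /r for the simple [base] form only (illustrative sketch)."""
    opcode, modrm, sib = code[0], code[1], code[2]
    if opcode != 0x62:
        raise ValueError("not the BOUND opcode")
    mod = modrm >> 6          # addressing mode
    reg = (modrm >> 3) & 7    # register operand
    rm = modrm & 7            # 4 means a SIB byte follows
    if mod != 0 or rm != 4:
        raise ValueError("this sketch only handles mod=00 with a SIB byte")
    index = (sib >> 3) & 7    # 4 means no index register
    base = sib & 7            # 5 with mod=00 would mean disp32, no base
    if index != 4 or base == 5:
        raise ValueError("this sketch only handles a plain [base] address")
    return {"reg": REGS32[reg], "mem_base": REGS32[base]}


def render_binutils_att(f: dict) -> str:
    """Register first, the order objdump/as use."""
    return f"bound %{f['reg']},(%{f['mem_base']})"


def render_capstone_att(f: dict) -> str:
    """Memory first, the order Capstone/Keystone use."""
    return f"bound (%{f['mem_base']}), %{f['reg']}"


fields = decode_bound(bytes.fromhex("622c24"))
print(render_binutils_att(fields))   # bound %ebp,(%esp)
print(render_capstone_att(fields))   # bound (%esp), %ebp
```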
objdump -d lists the register operand first in both AT&T and Intel syntax.
I assembled db 0x62, 0x2c, 0x24 with NASM and linked it with ld -melf_i386 into a 32-bit ELF executable (because I have a wrapper script that makes it easier to assemble+link+disassemble than to just assemble).
objdump -drwC -Mintel
8048060: 62 2c 24 bound ebp,QWORD PTR [esp]
objdump -drwC -Matt
8048060: 62 2c 24 bound %ebp,(%esp)
It does seem to be a quirk of AT&T syntax as implemented in binutils (as / objdump / gdb) that bound requires the register arg to be listed first.
bound %eax, (%edx) # assembles fine
bound (%edx), %eax # foo.s:2: Error: operand size mismatch for `bound'
I assume Intel-syntax mode likewise requires the register arg to be first. There's no ambiguity in meaning here, just an odd design choice not to reverse the operands relative to Intel syntax.
Related: AT&T syntax also has "bugs" according to the GAS manual:
9.15.16 AT&T Syntax bugs
The UnixWare assembler, and probably other AT&T derived ix86 Unix assemblers, generate floating point instructions with reversed source and destination registers in certain cases. Unfortunately, gcc and possibly many other programs use this reversed syntax, so we’re stuck with it.
For example fsub %st,%st(3) results in %st(3) being updated to %st - %st(3) rather than the expected %st(3) - %st. This happens with all the non-commutative arithmetic floating point operations with two register operands where the source register is %st and the destination register is %st(i).
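A tiny worked example of what that manual excerpt describes, as a Python sketch. The function name and the as_assembled flag are invented for illustration; the two formulas are taken directly from the quoted text, not from testing a real assembler.

```python
def fsub_st_to_sti(st0: float, sti: float, as_assembled: bool = True) -> float:
    """Value that ends up in %st(i) after 'fsub %st,%st(i)'.

    as_assembled=True  -> the behaviour the manual says you get:  %st - %st(i)
    as_assembled=False -> the behaviour the syntax would suggest: %st(i) - %st
    """
    return st0 - sti if as_assembled else sti - st0


st0, st3 = 5.0, 2.0
print(fsub_st_to_sti(st0, st3))                      # 3.0
print(fsub_st_to_sti(st0, st3, as_assembled=False))  # -3.0
```

Unlike bound, the two readings here produce different results, so the order really matters.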
So AT&T syntax has actual bugs where both orders are valid and mean different things. I think we can group this operand "reversal" in with that.
ndisasm -b32 disassembles it as 622C24 bound ebp,[esp], matching the Intel manual's operand order.