I'm trying to disassemble an ELF executable which I compiled using arm-linux-gnueabihf
to target thumb-2
. However, ARM instruction encoding is making me confused while debugging my disassembler. Let's consider the following instruction:
mov.w fp, #0
Which I disassembled using objdump
and hopper
as a thumb-2 instruction. The instruction appears in memory as 4ff0000b
which means that it's actually0b00f04f
(little endian). Therefore, the binary encoding of the instruction is:
0000 1011 0000 0000 1111 0000 0100 1111
According to ARM architecture manual, it seems like ALL thumb-2 instructions should start with 111[10|01|11]
. Therefore, the above encoding doesn't correspond to any thumb-2 instruction. Further, it doesn't match any of the encodings found on section A8.8.102 (page 484).
Am I missing something?
I think you're missing the subtle distinction that wide Thumb-2 encodings are not 32-bit words like ARM encodings, they are a pair of 16-bit halfwords (note the bit numbering above the ARM ARM encoding diagram). Thus whilst the halfwords themselves are little-endian, they are still stored in 'normal' order relative to each other. If the bytes in memory are 4ff0000b
, then the actual instruction encoded is f04f 0b00
.