assembly, x86, x86-64, disassembly, machine-code

Is it possible to decode x86-64 instructions in reverse?


I was wondering if it is possible to decode x86-64 instructions in reverse?

I need this for a runtime disassembler. Users can point to an arbitrary location in memory and should then be able to scroll upwards and see which instructions came before the specified address.

I want to do this by reverse decoding.


Solution

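  • The basic format of x86 instructions is: optional legacy prefixes, the opcode, an optional ModR/M byte, an optional SIB byte, an optional displacement, and an optional immediate.

    [Figure: x86 instruction format]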

    Modern CPUs also support VEX and EVEX prefixes. In x86-64 there may also be a REX prefix, which sits after any legacy prefixes and immediately before the opcode.

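    Looking at the format, it is easy to see that instructions aren't palindromes: the fields that determine an instruction's length (prefixes, opcode, ModR/M) come first, so you can't decode an instruction starting from its last byte.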


    As for determining which instruction an arbitrary address belongs to, unfortunately that can't be done either, because x86 instructions are not self-synchronizing and are (generally) not aligned. You have to know exactly where an instruction begins; otherwise the bytes will be decoded as different instructions.

    You can even give addresses that actually contain data, and the CPU or disassembler will just decode them as code, because no one knows what those bytes actually mean. Jumping into the middle of an instruction is often used for code obfuscation. The technique has also been used to save code size in the past, because a byte can be reused and has a different meaning depending on which instruction it belongs to.
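
    To make the synchronization problem concrete, here is a minimal sketch that decodes the same six bytes from three different start offsets. The decoder is a toy written for this illustration: `TOY_OPCODES` and `toy_decode` are hypothetical names, and the table covers only five one-byte opcodes with no prefixes or ModR/M handling.

```python
# Toy decoder for a tiny, hand-picked subset of one-byte x86 opcodes.
# Illustration only: no prefixes, no ModR/M, no VEX/EVEX.
# Each entry maps an opcode byte to (mnemonic, number of operand bytes).
TOY_OPCODES = {
    0x90: ("nop", 0),
    0xC3: ("ret", 0),
    0xEB: ("jmp rel8", 1),
    0x05: ("add eax, imm32", 4),
    0xB8: ("mov eax, imm32", 4),
}


def toy_decode(code: bytes, start: int) -> list:
    """Decode forward from `start`, returning (offset, mnemonic) pairs.

    Stops at the first opcode the toy table does not know, or at an
    instruction whose operand bytes would run past the end of `code`.
    """
    out = []
    pos = start
    while pos < len(code):
        entry = TOY_OPCODES.get(code[pos])
        if entry is None:
            out.append((pos, "(unknown)"))
            break
        name, operand_len = entry
        if pos + 1 + operand_len > len(code):
            out.append((pos, "(truncated)"))
            break
        out.append((pos, name))
        pos += 1 + operand_len
    return out


# Intended reading: mov eax, 0xC3909005 ; ret
code = bytes([0xB8, 0x05, 0x90, 0x90, 0xC3, 0xC3])

for start in range(3):
    print(start, toy_decode(code, start))
```

    Starting at offset 0 gives the intended `mov` followed by `ret`. Starting at offset 1 gives a single `add eax, imm32` that swallows the rest of the buffer, and starting at offset 2 gives two `nop`s and two `ret`s. All three listings are valid decodings, and nothing in the bytes themselves says which one is right.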

    That said, it might be possible to guess in many cases, since functions and loops are often aligned to 16 or 32 bytes, with NOP padding in between.
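
    The alignment guess can be sketched roughly as follows: back up to the previous 16-byte boundary, decode forward, and see whether an instruction boundary lands exactly on the requested address. This is only a heuristic under the assumption that the aligned offset really is the start of an instruction. `TOY_LENGTHS` and `guess_previous` are hypothetical names, and the length table covers just five one-byte opcodes; a real tool would use a full instruction-length decoder instead.

```python
from typing import Optional

# Total instruction length for a tiny toy subset of one-byte opcodes.
# Illustration only: a real tool needs a full x86-64 length decoder.
TOY_LENGTHS = {
    0x90: 1,  # nop
    0xC3: 1,  # ret
    0xEB: 2,  # jmp rel8
    0x05: 5,  # add eax, imm32
    0xB8: 5,  # mov eax, imm32
}


def guess_previous(code: bytes, target: int, align: int = 16) -> Optional[int]:
    """Guess the start of the instruction that ends at `target`.

    Assumes (without proof) that the closest `align`-byte boundary
    strictly before `target` begins an instruction, then decodes forward
    from there. Returns None when the decode does not land on `target`
    or hits an opcode the toy table does not know.
    """
    if not 0 < target <= len(code):
        return None
    anchor = ((target - 1) // align) * align
    pos, prev = anchor, None
    while pos < target:
        length = TOY_LENGTHS.get(code[pos])
        if length is None:
            return None
        prev, pos = pos, pos + length
    return prev if pos == target else None


code = bytes([
    0xB8, 0x05, 0x90, 0x90, 0xC3,  # 0:  mov eax, 0xC3909005
    0x05, 0x01, 0x00, 0x00, 0x00,  # 5:  add eax, 1
    0xC3,                          # 10: ret
    0x90, 0x90, 0x90, 0x90, 0x90,  # 11: NOP padding up to offset 16
    0xB8, 0x2A, 0x00, 0x00, 0x00,  # 16: mov eax, 42 (aligned)
    0xC3,                          # 21: ret
])

for target in (10, 21, 16, 7):
    print(target, guess_previous(code, target))
```

    A disassembler view could call such a function repeatedly to scroll upwards one instruction at a time. When it returns `None` (as it does here for offset 7, which lies inside an immediate), or when the aligned offset turns out to hold data rather than code, the guess fails and the tool has to fall back to something else, such as symbol information or a known function entry point.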