x86 arm reverse-engineering

What happens if assembly code jumps to an address that contains a bad instruction?


I wonder what happens if assembly code jumps to an address that contains a bad instruction.
I read in this blog that when jmpq jumps to an address containing a bad instruction,
the machine responds by not jumping at all and simply executes the next instruction (or maybe I misunderstand the blog). For example:

jmpq abc
[ code ]
...
abc : 
[ bad instruction ]

It would simply execute [ code ].

But that blog is all about the x86 instruction set. I wonder what happens on the ARM instruction set.

The reason I ask is that I have recently been working on a project that is a Unity IL2CPP-based APK, and it has been obfuscated.

I decompiled it and found many bl instructions that jump to bad instructions.

I wonder what happens if they jump to a bad instruction, or if the execution path contains a bad instruction.


PS: the decompiler I use is Ghidra.


Solution

  • bits is bits.

    The processor cannot possibly know that an address points at a bad instruction. Processors are incredibly dumb. They do what they are told, what they are programmed to do. Just like a train on tracks: if you happen to leave a gap in one or both tracks, or the tracks do not line up, the train is probably going to crash. Or it might roll along upright until it hits a house or something.

    The processor (ARM, Intel, etc.; which one is irrelevant, same answer) will take the next byte(s) it finds per its rules (linear execution, branching, etc.) and try to decode and execute them as an instruction. If those bytes are "bad" as in an invalid instruction, then some/many/most processors will raise an exception and do whatever the ISA defines for that case (call an exception handler, hang, reset, etc.). If the bytes are bad as in not the instruction you intended, but the bit/byte pattern just happens to be a valid instruction, it will execute it, because processors are very very very dumb: they do what they are programmed to do, no exceptions.

    So there is no wondering what will happen if... The processor will try to execute the bytes/bits it finds, as it does for every single instruction cycle, branch or no branch. If the encoded branch address violates the ISA then, same answer, it will do whatever the ISA has defined for that fault.

    Now on to disassemblers. With any variable-length instruction set (x86 definitely; ARM with arm, thumb and thumb2 mixed is also a problem), assume you cannot disassemble it and assume the disassembly is bad. Put very very very little faith in instructions that look bad or that go off into the weeds (with a bl to a bad place, the bl disassembly itself may be the problem, not the destination). The only good way to deal with a variable-length instruction set is to disassemble from a known-good entry point and in execution order, not linearly through memory. Even with that, and particularly with ARM but also others, you will end up with a good portion of the binary that cannot be disassembled, because you cannot statically determine some of the execution paths; you have to actually execute, simulate, or as a human visually examine the code to find them.

    Some disassemblers are worse than others, and some combinations of disassembler and instruction set make for unusable output. It is pretty easy to watch GNU objdump fail miserably with x86 code. If you know what you are doing you can make the objdump output absolutely dreadful (for x86) and not even remotely close to being correct. ARM with thumb and thumb2, same answer. RISC-V, etc.
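
To make both halves of that concrete, here is a small Python sketch. Everything in it is made up for illustration: the four-opcode instruction set, the byte values, and the names `decode`, `run`, `linear_sweep`, `follow_flow` and `UndefinedInstruction` are assumptions, not any real ISA or any real disassembler's API. It only shows the shape of the problem: the toy core decodes whatever bytes it lands on, and a listing made linearly through memory can disagree with one made in execution order.

```python
# Toy variable-length ISA (hypothetical, for illustration only):
#   0x01        NOP          (1 byte)
#   0x02 imm8   LOAD imm8    (2 bytes)
#   0x03 addr8  JMP addr8    (2 bytes)
#   0x04        HALT         (1 byte)
# Every other opcode byte is undefined.
LENGTHS = {0x01: 1, 0x02: 2, 0x03: 2, 0x04: 1}
NAMES = {0x01: "NOP", 0x02: "LOAD", 0x03: "JMP", 0x04: "HALT"}


class UndefinedInstruction(Exception):
    """Stands in for whatever fault a real ISA defines for a bad encoding."""


def decode(mem, pc):
    """Decode whatever bytes sit at pc; return (opcode, operand, length)."""
    op = mem[pc] if 0 <= pc < len(mem) else None
    if op not in LENGTHS or pc + LENGTHS[op] > len(mem):
        raise UndefinedInstruction(f"undefined encoding at address {pc}")
    operand = mem[pc + 1] if LENGTHS[op] == 2 else None
    return op, operand, LENGTHS[op]


def text(op, operand):
    """Render one decoded instruction as a string."""
    return NAMES[op] if operand is None else f"{NAMES[op]} {operand}"


def run(mem, pc=0):
    """Fetch, decode, execute. Nothing here asks what the bytes were meant to be."""
    acc = 0
    while True:
        op, operand, length = decode(mem, pc)  # faults on an undefined encoding
        if op == 0x04:      # HALT: return the accumulator
            return acc
        if op == 0x03:      # JMP: the branch is always taken
            pc = operand
            continue
        if op == 0x02:      # LOAD: put the immediate in the accumulator
            acc = operand
        pc += length


def linear_sweep(mem):
    """Disassemble byte after byte through memory, ignoring control flow."""
    out, pc = [], 0
    while pc < len(mem):
        try:
            op, operand, length = decode(mem, pc)
            out.append((pc, text(op, operand)))
        except UndefinedInstruction:
            out.append((pc, "(bad)"))
            length = 1
        pc += length
    return out


def follow_flow(mem, entry=0):
    """Disassemble in execution order, starting from a known-good entry point."""
    out, pc = {}, entry
    while pc not in out:
        op, operand, length = decode(mem, pc)
        out[pc] = text(op, operand)
        if op == 0x04:
            break
        pc = operand if op == 0x03 else pc + length
    return sorted(out.items())


# JMP 4, then two junk bytes (0xFF, 0x02), then LOAD 42, HALT.
PROGRAM = bytes([0x03, 0x04, 0xFF, 0x02, 0x02, 0x2A, 0x04])

print(linear_sweep(PROGRAM))
print(follow_flow(PROGRAM))
print(run(PROGRAM))
```

In this toy, the linear listing decodes the junk byte 0x02 as the start of a LOAD, swallows the first byte of the real instruction at address 4, and so the JMP appears to land in the middle of an instruction with a `(bad)` right after it. The execution-order listing and `run` both see `LOAD 42` at address 4. Calling `run(PROGRAM, pc=2)` raises `UndefinedInstruction`, which is the toy's stand-in for the ISA-defined fault; it does not skip the jump and carry on.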