I'm using an STM32F401RE (ARM Cortex-M4, 32-bit RISC) and was curious about the following.
Normally instructions on a 32 bit ARM can be 2 byte or 4 byte long. I accidentally jumped into the middle of a 2-byte instruction, and the microprocessor immediately ended up in an infinite Error Handler loop.
I later tested this by deliberately jumping into the middle of 4-byte and 2-byte instructions, and the microprocessor always went into the Error Handler.
I used the following C code to jump to memory addresses.
void (*foo)(void) = (void (*)(void))0x80002e8;
foo();
The addresses of the functions and instructions are taken from the disassembly. After storing the address in r3, the compiler emitted the following assembler instruction.
blx r3
Question: How exactly can the Microprocessor tell that it didn't start at the beginning of an instruction but actually started in-between one?
Especially in the case of the 16-bit Thumb instructions, whose encoding space is already pretty cramped.
I have multiple guesses but want to know what exactly is going on.
Normally instructions on a 32 bit ARM can be 2 byte or 4 byte long.
Only for Thumb-2; on Thumb they are all 2 bytes, and in ARM ("A32") mode they are all 4 bytes.
Question: How exactly can the Microprocessor tell that it didn't start at the beginning of an instruction but actually started in-between one?
It can't. If the 2 upper bytes of a 4-byte instruction happen to form a valid 2-byte instruction and you jump there, it will be executed as such. In your case, these upper 2 bytes probably were all invalid instructions, resulting in a fault exception.
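The reason is that a Thumb-2 decoder decides the instruction length only from the halfword at the address it was handed: if the top five bits of that halfword are 0b11101, 0b11110 or 0b11111, it is taken as the first half of a 4-byte instruction, otherwise as a complete 2-byte instruction. A minimal sketch of that rule in C follows. The function name thumb2_insn_length is made up for illustration, and the sketch does not check whether the encoding is actually a defined instruction.

```c
#include <stdint.h>

/* Hypothetical helper: length in bytes (2 or 4) that a Thumb-2 decoder
 * assumes for an instruction whose first halfword is hw.
 * Only bits [15:11] matter; nothing marks a halfword as "second half". */
static int thumb2_insn_length(uint16_t hw)
{
    uint16_t top5 = hw >> 11;
    return (top5 == 0x1D || top5 == 0x1E || top5 == 0x1F) ? 4 : 2;
}
```

Nothing in this rule depends on what came before the halfword, so the second half of a 4-byte instruction is classified exactly like any other halfword.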
For example, the program
    .code 16
    .syntax unified
test4byte:
    mov.w r0, #0x88000000
test2byte:
    ands r0, r1
will be assembled into
00000000 <test4byte>:
0: f04f 4008 mov.w r0, #2281701376 ; 0x88000000
00000004 <test2byte>:
4: 4008 ands r0, r1
or as a byte-wise hex dump
4f f0 08 40 08 40
As you can see, the sequence 08 40 occurs twice: once as the upper 2 bytes of the mov.w and once as the ands instruction. The two encodings are identical, so the processor has no way to tell them apart.
In a program that just contained the shown mov.w instruction, if you jumped to address 0, the mov.w would be executed; if you jumped to address 2, an ands would be executed, even though it doesn't appear in the assembly code.