arm interrupt armv7 thumb cortex-r

Why is the link register in FIQ mode the instruction address plus 4 in Thumb state, rather than the instruction address plus 2?


I am going through the Cortex-R5 Technical Reference Manual (revision r1p2). According to the manual:

  • LR_SVC holds IA + 4 in ARM state and IA + 2 in Thumb state, which I can understand, since the size of a Thumb instruction is 2 bytes.
  • So I was expecting the same behavior for IRQ mode. But the manual states that LR_IRQ holds IA + 4 in both ARM and Thumb states. Can anyone explain why this is the case for IRQ mode?

I would also appreciate an explanation of why LR_DABT holds IA + 8 in both ARM and Thumb states.

  • My guess is that the data abort is raised in the load-store unit, by which time the PC in the data-processing unit is already at IA + 8. This fits ARM state, but I cannot see why it is also IA + 8 in Thumb state. Please correct me if I am wrong.

For the LR values in the other modes, see the recommended exception exit table (Table 3-4) in the Cortex-R5 TRM r1p2.


Solution

  • Why is the link register in FIQ mode the instruction address plus 4 in Thumb state, rather than the instruction address plus 2?

    TL;DR - simplicity.

    Because the designers care more about making the design correct than about making a system programmer's life easy. To implement what you want, the CPU would have to inspect the instruction-set state (ARM or Thumb) on exception entry. What if the interrupted instruction was changing that state? What would the correct thing to do be then?

    I would also appreciate an explanation of why LR_DABT holds IA + 8 in both ARM and Thumb states.

    The reason is similar, and it is as you stated. The CPU is pipelined, and several instructions are partially complete in different stages. The CPU needs to either discard or undo the effects of these instructions, as they will re-run when execution resumes.

    For a data abort, the load/store units sit in later pipeline stages, so prefetches and the effects of other units are already complete. It is common to interleave non-load/store instructions with the load/store instructions that can cause a data abort. By the time a store gets out on a bus to an end device, the CPU can already have accomplished a lot. To get your 'wish' of the aborting address + 4 in the LR, you would have to accept diminished performance. Now do you mind?


    A final reason is compatibility with older designs. These offsets are the same for Cortex-A and for ARMv4, ARMv5 and ARMv6. This allows kernels to run on multiple CPU types without so much conditional code. Run-time CPU identification is often difficult, as you need fixed entries in the vector table, and it also increases IRQ latency. The vector table is often read-only, as it is important that it is not compromised, so allowing one entry to work for multiple architectures can actually simplify things.
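
The fixed offsets discussed above can be sketched as a small C model. This is only an illustration, not code from the TRM: the names `lr_entry_offset`, `return_adjust` and `resume_address` are hypothetical, the entry offsets are the ones quoted in the question, and the meaning of IA per exception (the SVC instruction itself, the instruction that did not execute because the interrupt was taken first, or the aborting load/store) is an assumption stated in the comments. The exit adjustments model the usual `MOVS PC, LR`, `SUBS PC, LR, #4` and `SUBS PC, LR, #8` returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The three exceptions discussed in the question. */
enum exception { EXC_SVC, EXC_IRQ_FIQ, EXC_DABT };

/* Offset added to IA when the core writes the banked LR on entry.
 * Values as quoted in the question. Assumed meaning of IA:
 *   SVC     - address of the SVC instruction itself
 *   IRQ/FIQ - address of the instruction that did not execute because
 *             the interrupt was taken first
 *   DABT    - address of the load/store that aborted */
static uint32_t lr_entry_offset(enum exception exc, bool thumb)
{
    switch (exc) {
    case EXC_SVC:     return thumb ? 2u : 4u; /* LR = instruction after SVC */
    case EXC_IRQ_FIQ: return 4u;              /* same in ARM and Thumb state */
    case EXC_DABT:    return 8u;              /* same in ARM and Thumb state */
    }
    assert(!"unknown exception");
    return 0u;
}

/* Amount the handler subtracts from LR on exit. Note that it never
 * depends on the interrupted state, so one exit sequence serves both. */
static uint32_t return_adjust(enum exception exc)
{
    switch (exc) {
    case EXC_SVC:     return 0u; /* MOVS PC, LR                      */
    case EXC_IRQ_FIQ: return 4u; /* SUBS PC, LR, #4                  */
    case EXC_DABT:    return 8u; /* SUBS PC, LR, #8 (retry the access) */
    }
    assert(!"unknown exception");
    return 0u;
}

/* Address at which execution resumes after entry followed by exit. */
static uint32_t resume_address(enum exception exc, uint32_t ia, bool thumb)
{
    uint32_t lr = ia + lr_entry_offset(exc, thumb);
    return lr - return_adjust(exc);
}
```

In this model, `resume_address(EXC_IRQ_FIQ, ia, thumb)` and `resume_address(EXC_DABT, ia, thumb)` give back `ia` whether `thumb` is true or false, which is why a handler can use one fixed exit instruction without checking the saved state. Only the SVC case varies with the state, and there the variation is absorbed at entry, so the exit is still a plain `MOVS PC, LR`.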