
Recover from Hard Fault on Cortex M0+


Until now I had a Hard fault handler in C that I defined in the vector table:

.sect ".intvecs"

.word _top_of_main_stack
.word _c_int00
.word NMI  
.word Hard_Fault
.word Reserved
.word Reserved  
.word Reserved
.word Reserved
.word Reserved
.word Reserved
.word Reserved
.word Reserved
.word Reserved
.word Reserved
.word Reserved
.word Reserved
....
....
....

One of our tests deliberately triggers a hard fault by writing to a nonexistent address. Once the test is done, the handler returns to the calling function and the Cortex recovers from the fault. It is worth mentioning that this handler takes no arguments.

Now I'm in the phase of writing a real handler. I created a struct for the stack frame so we can print PC, LR, and xPSR in case of a fault:

typedef struct
{
    int     R0              ;  
    int     R1              ;  
    int     R2              ;  
    int     R3              ;  
    int     R12             ;
    int     LR              ; 
    int     ReturnAddress   ; 
    int     xPSR            ;

}   InterruptStackFrame_t  ;

My hard fault handler in C is defined as:

void Hard_Fault(InterruptStackFrame_t* p_stack_frame)
{
    // Write to external memory that I can read from outside
    /* prints a message containing information about the stack frame:
     * p_stack_frame->LR, p_stack_frame->ReturnAddress (PC), p_stack_frame->xPSR,
     * (uint32_t)p_stack_frame (SP)
     */
}
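As an illustration of how the eight stacked words line up with the struct, here is a minimal host-side sketch. It uses uint32_t instead of int so that every field is exactly one 32-bit word. The helper name read_frame and the idea of copying from a plain word array are assumptions made for the example, not part of the original code.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Words pushed by the core on exception entry, lowest address first. */
typedef struct
{
    uint32_t R0;
    uint32_t R1;
    uint32_t R2;
    uint32_t R3;
    uint32_t R12;
    uint32_t LR;
    uint32_t ReturnAddress;   /* stacked PC */
    uint32_t xPSR;
} InterruptStackFrame_t;

/* The layout only works if the struct is exactly eight words with no padding. */
_Static_assert(sizeof(InterruptStackFrame_t) == 32, "frame must be 8 words");
_Static_assert(offsetof(InterruptStackFrame_t, ReturnAddress) == 24,
               "stacked PC must be the 7th word");

/* Hypothetical helper: copy the eight words found at the stacked SP. */
static InterruptStackFrame_t read_frame(const uint32_t *sp)
{
    InterruptStackFrame_t frame;
    memcpy(&frame, sp, sizeof frame);
    return frame;
}
```

On the target, sp would be the MSP value captured on handler entry; on a host it can be any array of eight words, which makes the layout easy to check without hardware.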

I created an assembly function:

        .thumbfunc  _hard_fault_wrapper
_hard_fault_wrapper: .asmfunc
    MRS    R0, MSP    ; store pointer to stack frame
    BL     Hard_Fault ; go to C function handler
    POP    {R0-R7}    ; pop out all stack frame
    MOV    PC, R5     ; jump to LR that was in the stack frame (the calling function before the fault)

.endasmfunc
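To make explicit which stack-frame word ends up in which register after POP {R0-R7}, here is a small host-side simulation in C. POP loads the lowest-numbered register from the lowest address, so R4 to R7 receive the stacked R12, LR, return address, and xPSR. The function and enum names are made up for the illustration.

```c
#include <stdint.h>

/* Which register receives which frame word after POP {R0-R7}. */
enum
{
    REG_HOLDS_STACKED_R12  = 4,
    REG_HOLDS_STACKED_LR   = 5,   /* the value used by MOV PC, R5 */
    REG_HOLDS_STACKED_PC   = 6,
    REG_HOLDS_STACKED_XPSR = 7
};

/* Simulates POP {R0-R7}: register i is loaded from the i-th word on the stack. */
static void simulate_pop_r0_r7(const uint32_t stack_words[8], uint32_t regs[8])
{
    for (int i = 0; i < 8; ++i)
    {
        regs[i] = stack_words[i];
    }
}
```

In this model R5 carries the stacked LR, which is the return address of the function that was running when the fault hit, not the address of the faulting instruction (that one lands in R6).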

I should mention that I don't have an OS, so I don't need to check bit[2] of LR: I know for certain that I use MSP and not PSP.
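For reference, the bit[2] check mentioned above would look roughly like this in C. The three constants are the EXC_RETURN values defined for ARMv6-M; the macro and function names are invented for this sketch.

```c
#include <stdint.h>
#include <stdbool.h>

/* EXC_RETURN values the core can place in LR on exception entry (ARMv6-M). */
#define EXC_RETURN_HANDLER_MSP 0xFFFFFFF1u  /* return to Handler mode, frame on MSP */
#define EXC_RETURN_THREAD_MSP  0xFFFFFFF9u  /* return to Thread mode, frame on MSP  */
#define EXC_RETURN_THREAD_PSP  0xFFFFFFFDu  /* return to Thread mode, frame on PSP  */

/* Bit 2 of EXC_RETURN tells which stack pointer holds the exception frame. */
static bool frame_is_on_psp(uint32_t exc_return)
{
    return (exc_return & (1u << 2)) != 0u;
}
```

With no OS and the process stack never enabled, LR on handler entry should only ever be one of the two MSP values, so the check always comes out false.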

The program compiles and runs, and I used JTAG to verify that all registers are restored to the expected values. When the last instruction (MOV PC, R5) executes, the PC returns to the correct address, but at some later point the debugger reports that the M0+ is locked up in a hard fault and cannot recover.

I do not understand the difference between using a C function as a handler or an assembly function that calls a C function.

Does anyone know what the problem is?

Eventually, I will use an assert function that halts the processor, but I want that to be optional and under my control.


Solution

  • To explain "old_timer"'s comment:

    When entering an exception or interrupt handler on the Cortex the LR register has a special value.

    Normally you return from the exception handler by simply jumping to that value (by writing that value to the PC register).

    The Cortex CPU will then automatically pop all the registers from the stack and it will reset the interrupt logic.

    If you instead jump directly to an address stored in the stack frame, you destroy some registers and you do not restore the interrupt logic.

    Therefore this is not a good idea.

    Instead I'd do something like this:

        .thumbfunc  _hard_fault_wrapper
    _hard_fault_wrapper: .asmfunc
        MRS    R0, MSP
        B      Hard_Fault
    

    EDIT

    Using the B instruction may not work because its branch range is much smaller than that of BL (roughly ±2 KB for the 16-bit B encoding versus ±16 MB for BL).

    However there are two possibilities you could use (unfortunately I'm not sure if these will definitely work).

    The first one will return to the address that had been passed in the LR register when entering the assembler handler:

        .thumbfunc  _hard_fault_wrapper
    _hard_fault_wrapper: .asmfunc
        MRS    R0, MSP
        PUSH   {LR}
        BL     Hard_Fault
        POP    {PC}
    

    The second one will indirectly do the jump:

        .thumbfunc  _hard_fault_wrapper
    _hard_fault_wrapper: .asmfunc
        MRS    R0, MSP
        LDR    R1, =Hard_Fault
        MOV    PC, R1
    

    EDIT 2

    You cannot use LR because it holds EXC_RETURN value. ... You have to read the LR from stack and you must clean the stack from the stack frame, because the interrupted program doesn't know that a frame was stored.

    According to the Cortex M3 manual you must exit from an exception handler by writing one of the three EXC_RETURN values to the PC register.

    If you simply jump to the LR value stored in the stack frame you remain in the exception handler!

    If anything then goes wrong in the program, the CPU treats it as a fault raised inside the hard fault handler and it hangs.

    I assume that the Cortex M0 works the same way as the M3 in this respect.

    If you want to modify some CPU register during the exception handler you can modify the stack frame. The CPU will automatically pop all registers from the stack frame when you write the EXC_RETURN value to the PC register.

    If you want to modify one of the registers not present in the stack frame (such as R5) you can directly modify it in the exception handler.

    And this shows another problem of your interrupt handler:

    The instruction POP {R0-R7} loads R4 to R7 with words from the stack frame, which do not match the values the interrupted program had in those registers. R12 may also be destroyed, depending on the C code. This means that, from the point of view of the interrupted program, these registers suddenly change while the program is not prepared for that!
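The idea of modifying the stack frame instead of jumping manually can be sketched in C as follows. This is only a sketch under two assumptions that are not stated in the question: the stacked return address points at the faulting instruction, and that instruction uses a 16-bit Thumb encoding (as a plain STR does). The helper name and the THUMB16_SIZE constant are made up for the example.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Same layout as the stack frame in the question, with fixed-width fields. */
typedef struct
{
    uint32_t R0;
    uint32_t R1;
    uint32_t R2;
    uint32_t R3;
    uint32_t R12;
    uint32_t LR;
    uint32_t ReturnAddress;   /* stacked PC */
    uint32_t xPSR;
} InterruptStackFrame_t;

/* Assumption: the faulting instruction is a 16-bit Thumb instruction. */
#define THUMB16_SIZE 2u

/* Hypothetical helper: make the exception return resume after the fault. */
static void skip_faulting_instruction(InterruptStackFrame_t *frame)
{
    assert(frame != NULL);
    frame->ReturnAddress += THUMB16_SIZE;
}
```

A handler that calls this and then returns in the normal way leaves the actual exit to the EXC_RETURN mechanism: the core unstacks R0-R3, R12, LR, the adjusted PC, and xPSR itself, and R4-R7 are never touched by the wrapper.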