
What does the following error mean on the ESP32?


I am working with MQTT on an ESP32. When I publish a message from the cloud to the microcontroller, the ESP32 receives the MQTT message and the program processes it according to its content. Once processing completes, I send an acknowledgement over MQTT. After the acknowledgement is sent, the ESP32 crashes. What does this error mean, and what are the possible reasons for it?

DEBUG: [mqtt.c:800:handleMqttPayload]  ------------------------>line  
DEBUG: [mqtt.c:472:_mqttSubscriptionCallback]  ------------------------>line  
Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Exception was unhandled.
Core 0 register dump:
PC      : 0x400957b6  PS      : 0x00060933  A0      : 0x80085160  A1      : 0x3ffe2120  
0x400957b6: is_free at /home/horsemann/Desktop/WorkSpace/TestingRepo/vendors/espressif/esp-idf/components/heap/multi_heap.c:380
 (inlined by) multi_heap_malloc_impl at /home/horsemann/Desktop/WorkSpace/TestingRepo/vendors/espressif/esp-idf/components/heap/multi_heap.c:432

A2      : 0x3ffb9a20  A3      : 0x00000074  A4      : 0x3ffb9bc2  A5      : 0x3ffc24f4  
A6      : 0x00000000  A7      : 0x3ffc1930  A8      : 0x62df42e6  A9      : 0x00003ffb  
A10     : 0x00000001  A11     : 0x00000001  A12     : 0x62df42e6  A13     : 0x3ffb9b98  
A14     : 0x3ffb9bc2  A15     : 0x00000003  SAR     : 0x0000001d  EXCCAUSE: 0x0000001c  
EXCVADDR: 0x00003ffb  LBEG    : 0x4000c2e0  LEND    : 0x4000c2f6  LCOUNT  : 0xffffffff  

ELF file SHA256: 6ba6c7666cfc3a6affb97ff2c01bc138e861781ba07bfe6d7e01fbf2e790ec91

Backtrace: 0x400957b6:0x3ffe2120 0x4008515d:0x3ffe2140 0x40085456:0x3ffe2160 0x40085671:0x3ffe21a0 0x400817ad:0x3ffe21c0 0x400eeae1:0x3ffe21e0 0x400ef431:0x3ffe2220 0x400ee77f:0x3ffe2240 0x400e2ca4:0x3ffe2270 0x400e2e90:0x3ffe22c0 0x400f2ca9:0x3ffe22e0 0x400f2d18:0x3ffe2300 0x400f3c7a:0x3ffe2320 0x4008fa5d:0x3ffe2350
0x400957b6: is_free at /home/horsemann/Desktop/WorkSpace/TestingRepo/vendors/espressif/esp-idf/components/heap/multi_heap.c:380
 (inlined by) multi_heap_malloc_impl at /home/horsemann/Desktop/WorkSpace/TestingRepo/vendors/espressif/esp-idf/components/heap/multi_heap.c:432

0x4008515d: heap_caps_malloc at /home/horsemann/Desktop/WorkSpace/TestingRepo/vendors/espressif/esp-idf/components/heap/heap_caps.c:232

0x40085456: trace_malloc at /home/horsemann/Desktop/WorkSpace/TestingRepo/vendors/espressif/esp-idf/components/heap/heap_trace.c:188

0x40085671: __wrap_heap_caps_malloc at /home/horsemann/Desktop/WorkSpace/TestingRepo/vendors/espressif/esp-idf/components/heap/heap_trace.c:421

0x400817ad: malloc_internal_wrapper at /home/horsemann/Desktop/WorkSpace/TestingRepo/vendors/espressif/esp-idf/components/esp32/esp_adapter.c:407

0x400eeae1: esf_buf_alloc at ??:?

0x400ef431: ic_ebuf_alloc at ??:?

0x400ee77f: ieee80211_getmgtframe at ??:?

0x400e2ca4: ieee80211_encap_null_data at ??:?

0x400e2e90: ieee80211_pm_tx_null_process at ??:?

0x400f2ca9: pm_tx_null_data_done_process at ??:?

0x400f2d18: pm_send_wake_null_cb at ??:?

0x400f3c7a: ppProcTxDone at ??:?

0x4008fa5d: ppTask at ??:?

Solution

  • The call stack is presented in the debug output:

    ppTask calls
    ppProcTxDone calls
    pm_send_wake_null_cb etc.
    

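    Since the error is likely to be in your code, look at the last call in the backtrace that is yours, at the place(s) where it calls the next function in the stack dump, and verify the validity of the call parameters. In this dump, however, no frame is application code: the frames from ppTask up to esf_buf_alloc carry no source information (??:?), and the remaining frames are in ESP-IDF's adapter and heap components, so the backtrace alone does not point at a line of yours.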
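As a minimal sketch of reading the "Backtrace:" line mechanically, the helper below pulls the PC half out of each PC:SP pair. The function name backtrace_pcs is hypothetical and not part of ESP-IDF. Entry 0 is the faulting frame (is_free in this dump) and the last entry is the outermost caller (ppTask).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: extract the PC of every "PC:SP" pair found in a
 * backtrace line. Returns the number of PCs stored (at most max_pcs). */
size_t backtrace_pcs(const char *line, uint32_t *pcs, size_t max_pcs)
{
    assert(line != NULL && pcs != NULL);

    size_t count = 0;
    const char *p = line;

    /* Only tokens that start with "0x" are considered, so the hex-looking
     * letters in the word "Backtrace" are never mistaken for an address. */
    while (count < max_pcs && (p = strstr(p, "0x")) != NULL) {
        char *end;
        unsigned long pc = strtoul(p, &end, 16);
        if (*end == ':') {
            pcs[count++] = (uint32_t)pc;   /* keep the PC */
            strtoul(end + 1, &end, 16);    /* skip the SP that follows */
        }
        p = end;                           /* always moves forward */
    }
    return count;
}
```

Each PC can then be resolved to a function and source line against the project's ELF file, for example with the toolchain's addr2line, which is what produced the "function at file:line" lines in the dump above.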

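    Another useful piece of information is the value in the EXCCAUSE (Exception Cause) register, 28 (0x1C), which indicates an access to an invalid address and corresponds to the LoadProhibited in the panic line. The EXCVADDR register (0x00003ffb) holds the address whose access raised the exception.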
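The mapping from EXCCAUSE value to name can be sketched as a small lookup table. The table below is deliberately partial (a handful of common causes, not the full Xtensa list), and the names exccause_name and exccause_self_check are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t code;
    const char *name;
} exccause_entry_t;

/* Partial table only: a few causes as they are named in ESP32 panic output. */
static const exccause_entry_t EXCCAUSE_NAMES[] = {
    { 0,  "IllegalInstruction" },
    { 6,  "IntegerDivideByZero" },
    { 9,  "LoadStoreAlignment" },
    { 20, "InstFetchProhibited" },
    { 28, "LoadProhibited" },
    { 29, "StoreProhibited" },
};

/* Return the name for an EXCCAUSE value, or a placeholder if unlisted. */
const char *exccause_name(uint32_t code)
{
    size_t entries = sizeof EXCCAUSE_NAMES / sizeof EXCCAUSE_NAMES[0];
    for (size_t i = 0; i < entries; i++) {
        if (EXCCAUSE_NAMES[i].code == code) {
            return EXCCAUSE_NAMES[i].name;
        }
    }
    return "unknown (not in this partial table)";
}

/* Cross-check against the dump: EXCCAUSE 0x0000001c should agree with the
 * "LoadProhibited" printed in the panic line. */
void exccause_self_check(void)
{
    assert(strcmp(exccause_name(0x0000001cu), "LoadProhibited") == 0);
}
```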

    The location of the exception is:

    0x400957b6: is_free at [...]/espressif/esp-idf/components/heap/multi_heap.c:380
    

    That function looks like this:

    static inline bool is_free(const block_header_t *block)
    {
        return ((block->size & 0x01) != 0);
    }
    

    The most likely cause of an exception there is that block, when dereferenced by block->size, refers to an invalid location, is null, or is misaligned. The exception suggests the first (or possibly the second) of these possibilities.
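The three possibilities can be told apart with a simple check on the faulting address. This is only a sketch: the names are hypothetical, and the data-RAM window 0x3FFAE000 to 0x3FFFFFFF is an assumption about the ESP32 internal memory map that should be confirmed against the technical reference manual for your chip.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    PTR_NULL,
    PTR_OUT_OF_RANGE,
    PTR_MISALIGNED,
    PTR_PLAUSIBLE
} ptr_class_t;

/* ASSUMPTION: internal data RAM window used only for this illustration. */
#define DRAM_LOW   0x3FFAE000u
#define DRAM_HIGH  0x40000000u   /* exclusive upper bound */

/* Classify an address that is about to be read as a heap block header. */
ptr_class_t classify_block_ptr(uint32_t addr)
{
    if (addr == 0u) {
        return PTR_NULL;
    }
    if (addr < DRAM_LOW || addr >= DRAM_HIGH) {
        return PTR_OUT_OF_RANGE;
    }
    if ((addr & 0x3u) != 0u) {
        return PTR_MISALIGNED;   /* header is read as a 32-bit word */
    }
    return PTR_PLAUSIBLE;
}

/* Values taken from the register dump in the question. */
void check_dump_values(void)
{
    /* EXCVADDR: neither null nor inside the assumed RAM window. */
    assert(classify_block_ptr(0x00003ffbu) == PTR_OUT_OF_RANGE);
    /* A2: looks like an ordinary, aligned RAM address. */
    assert(classify_block_ptr(0x3ffb9a20u) == PTR_PLAUSIBLE);
}
```

Under that assumption, the EXCVADDR value 0x00003ffb is not a usable address at all, which is consistent with a block header having been overwritten rather than with a simple null pointer.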

    That suggests heap corruption, which could have occurred anywhere, at any time previously, and not necessarily in the code path indicated by the backtrace. It is typically caused by overrunning or underrunning an allocated heap block, and is only detected when a later heap operation (malloc, free, new, delete etc.) tries to interpret the already corrupted heap.
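The delay between the faulty write and its detection can be illustrated with a toy allocator. This is not how multi_heap.c is implemented; every name here is hypothetical, and the "heap" is a single static array so that the deliberate overrun stays inside one object and the example is safe to run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARENA_SIZE   64u
#define BLOCK_MAGIC  0xB10Cu     /* arbitrary marker for an intact header */

typedef struct {
    uint16_t magic;              /* BLOCK_MAGIC while the header is intact */
    uint16_t length;             /* payload bytes that follow the header */
} toy_header_t;

static uint8_t arena[ARENA_SIZE];
static size_t arena_used;

/* Bump allocator: header immediately followed by the payload. */
void *toy_alloc(uint16_t length)
{
    if (arena_used + sizeof(toy_header_t) + length > ARENA_SIZE) {
        return NULL;
    }
    toy_header_t header = { BLOCK_MAGIC, length };
    memcpy(&arena[arena_used], &header, sizeof header);
    void *payload = &arena[arena_used + sizeof header];
    arena_used += sizeof header + length;
    return payload;
}

/* Walk all headers; return the index of the first damaged block, or -1. */
int toy_first_corrupt_block(void)
{
    size_t offset = 0;
    for (int index = 0; offset < arena_used; index++) {
        toy_header_t header;
        if (offset + sizeof header > arena_used) {
            return index;        /* a bad length led the walk astray */
        }
        memcpy(&header, &arena[offset], sizeof header);
        if (header.magic != BLOCK_MAGIC) {
            return index;
        }
        offset += sizeof header + header.length;
    }
    return -1;
}

int toy_overrun_demo(void)
{
    arena_used = 0;
    char *first = toy_alloc(8);
    char *second = toy_alloc(8);
    assert(first != NULL && second != NULL);
    assert(toy_first_corrupt_block() == -1);  /* heap intact so far */

    memset(first, 'x', 12);   /* bug: writes 4 bytes past an 8-byte block */

    /* Nothing failed at the faulty write; only the next walk notices. */
    return toy_first_corrupt_block();
}
```

The overrun of the first block is reported against the second block, and only when the headers are next inspected. A real crash is attributed in the same indirect way, here to a malloc call made from the Wi-Fi task.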

    You therefore need to review your usage of every dynamically allocated block to ensure that, for example, you:

    • allocate an appropriate size in all cases,
    • do not access or modify data beyond the bounds of the allocated size,
    • do not access the memory after it has been free'd / delete'd or otherwise returned to the heap,
    • do not have a memory leak.
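As a sketch of what the checklist above looks like in code, the helpers below build an acknowledgement string in a buffer whose size is computed from the content, and release it in a way that leaves no dangling pointer. The message format "ack:<id>:<status>" and the names build_ack, release_ack and ack_round_trip are invented for illustration; they are not taken from the question's code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "ack:<id>:<status>" in a heap buffer sized from the content.
 * Returns NULL on failure; the caller owns the result. */
char *build_ack(unsigned id, const char *status)
{
    assert(status != NULL);

    /* First pass measures the formatted length, so the size is never guessed. */
    int needed = snprintf(NULL, 0, "ack:%u:%s", id, status);
    if (needed < 0) {
        return NULL;
    }
    size_t size = (size_t)needed + 1;    /* +1 for the terminating '\0' */

    char *buffer = malloc(size);
    if (buffer == NULL) {
        return NULL;                     /* always check the allocation */
    }
    snprintf(buffer, size, "ack:%u:%s", id, status);  /* bounded write */
    return buffer;
}

/* Free the buffer and clear the caller's pointer, so that any later use is
 * a visible NULL rather than a silent use-after-free. */
void release_ack(char **ack)
{
    assert(ack != NULL);
    free(*ack);
    *ack = NULL;
}

/* Usage: build, use, release. Returns 1 on success, 0 otherwise. */
int ack_round_trip(void)
{
    char *ack = build_ack(42u, "done");
    if (ack == NULL) {
        return 0;
    }
    int matches = strcmp(ack, "ack:42:done") == 0;
    release_ack(&ack);                   /* no leak, no stale pointer */
    return matches && ack == NULL;
}
```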

    Aside: "Guru Mediation Error" is a misspelling of "Guru Meditation Error". The phrase is itself meaningless (a "cute" computing-history reference to what was itself a joke, rendered less cute or funny by the misspelling), but it is essentially akin to a kernel panic or BSOD. The critical thing is that an exception occurred.