
Zephyr - Issue to create a thread - MPU FAULT, Data Access Violation


I'm currently working on a BLE Central using Zephyr and an nRF52840 DK.

I'm trying to log the RSSI of the current connection every second. At first, I called my function at the end of the connected callback, and it worked well, but after 30 seconds the connection was closed by the peer. I think my function to retrieve and log the RSSI value blocked the main thread, which in turn blocked pairing (and service discovery), so the peripheral closed the connection.

So I'm thinking of creating a thread at the end of the connected callback and delegating the RSSI logging to it. I will abort this thread at the beginning of the disconnected callback. However, when I create this thread, the application crashes due to a Data Access Violation.

[00:00:04.303,436] <err> os: ***** MPU FAULT *****
[00:00:04.303,466] <err> os:   Data Access Violation
[00:00:04.303,466] <err> os:   MMFAR Address: 0x58
[00:00:04.303,466] <err> os: r0/a1:  0x00000000  r1/a2:  0x20005000  r2/a3:  0x00000800
[00:00:04.303,497] <err> os: r3/a4:  0x00000058 r12/ip:  0x00000000 r14/lr:  0x000260f3
[00:00:04.303,497] <err> os:  xpsr:  0x410d0000
[00:00:04.303,497] <err> os: Faulting instruction address (r15/pc): 0x00026074
[00:00:04.303,527] <err> os: >>> ZEPHYR FATAL ERROR 19: Unknown error on CPU 0
[00:00:04.303,588] <err> os: Current thread: 0x20001088 (unknown)
[00:00:04.378,540] <err> os: Halting system

I used addr2line to find the faulting instruction and, surprisingly, it's not directly in my code but inside Zephyr itself.

Here's how I create the thread and the functions it calls (I've shortened the code and removed some parts):

#define THREAD_STACK_SIZE_RSSI 2048
#define THREAD_PRIORITY_RSSI 1

static void thread_entrypoint_rssi(void *p1, void *p2, void *p3);

uint16_t conn_handle;
struct k_thread *thread_data_rssi;
k_tid_t thread_id_rssi;
K_THREAD_STACK_DEFINE(thread_stack_rssi, THREAD_STACK_SIZE_RSSI);

static void connected_cb(struct bt_conn *conn, uint8_t conn_err)
{
  if (conn_err) { /* Some error handling */ }

  if (conn == default_conn)
  {
    int err = bt_conn_set_security(conn, BT_SECURITY_L3);
    if (err) { /* Some error handling */ }

    // Setting the discovering parameters and start to discover
    ...
    err = bt_gatt_discover(conn, &discover_params);
    if (err) { /* Some error handling */ }

    // HERE IS MY PROBLEM
    thread_id_rssi = k_thread_create(thread_data_rssi,
        thread_stack_rssi,
        THREAD_STACK_SIZE_RSSI,
        thread_entrypoint_rssi,
        conn,
        NULL,
        NULL,
        K_PRIO_PREEMPT(1),
        0,
        K_NO_WAIT);

    // This works for 30 seconds (after which the peripheral closes the connection)
    // thread_entrypoint_rssi(conn);
  }
}

static bool read_conn_rssi(int8_t *rssi)
{
  struct net_buf *buf, *rsp = NULL;
  struct bt_hci_cp_read_rssi *cp;
  struct bt_hci_rp_read_rssi *rp;

  int err;

  buf = bt_hci_cmd_create(BT_HCI_OP_READ_RSSI, sizeof(*cp));
  if (!buf) { /* Some error handling returning false */ }

  cp = net_buf_add(buf, sizeof(*cp));
  cp->handle = sys_cpu_to_le16(conn_handle);

  err = bt_hci_cmd_send_sync(BT_HCI_OP_READ_RSSI, buf, &rsp);
  if (err) { /* Some error handling returning false */ }

  rp = (void *)rsp->data;
  *rssi = rp->rssi;

  net_buf_unref(rsp);
  return true;
}

static bool set_conn_handle(struct bt_conn *conn)
{
  int ret = bt_hci_get_conn_handle(conn, &conn_handle);

  if (ret)
    return false;

  return true;
}

static void thread_entrypoint_rssi(void *p1, void *p2, void *p3)
{
  struct bt_conn *conn = (struct bt_conn *)p1;
  int8_t rssi = 0xFF;

  if (!set_conn_handle(conn)) { /* Some error handling */ }

  while(1)
  {
    if (!read_conn_rssi(&rssi))
      break;
    LOG_WRN("RSSI = %d", rssi);
    k_sleep(K_SECONDS(1));
  }
}

static void disconnected_cb(struct bt_conn *conn, uint8_t reason)
{
  k_thread_abort(thread_id_rssi);
  ...
}

Changing the priority of my thread to a negative value doesn't change anything.

Here's part of the prj.conf associated with this project:

CONFIG_BT=y
CONFIG_BT_DEVICE_NAME="BLE CENTRAL TEST"
CONFIG_BT_CENTRAL=y
CONFIG_BT_GATT_CLIENT=y
CONFIG_CONSOLE=y
CONFIG_GPIO=y
CONFIG_SERIAL=y
CONFIG_EVENTS=y

CONFIG_HW_STACK_PROTECTION=y
# Need to fine tune these values, for now they are random
CONFIG_MAIN_STACK_SIZE=100500
CONFIG_HEAP_MEM_POOL_SIZE=10000

CONFIG_BT_ATT_TX_COUNT=255
CONFIG_BT_BUF_ACL_TX_COUNT=255

CONFIG_BT_CTLR=y
CONFIG_BT_CTLR_CONN_RSSI=y

CONFIG_BT_SMP=y
CONFIG_BT_BONDABLE=y
...

I already tried doubling the stack size, both in prj.conf and in the define for the thread's stack, thinking it might be a stack overflow, but it didn't change anything. I also tried to define the thread with the K_THREAD_DEFINE() macro, but then it didn't even compile (the error says the initializer element is not constant).


Solution

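  • According to the Zephyr docs, your k_thread_create arguments are incorrect. In particular, the first argument, thread_data_rssi, is a pointer that never points to an actual struct k_thread. As a file-scope variable without an initializer it is NULL, so the kernel has no thread object to write into, which is consistent with the fault at the very low address 0x58.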

    You should have:

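    /* Reserve storage for the thread object itself, not just a pointer */
    struct k_thread thread_data_rssi;
    ...
    thread_id_rssi = k_thread_create(&thread_data_rssi,
        thread_stack_rssi,
        K_THREAD_STACK_SIZEOF(thread_stack_rssi),
    ...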
    

    Also, the third argument should be K_THREAD_STACK_SIZEOF(thread_stack_rssi) rather than THREAD_STACK_SIZE_RSSI, but I suspect the first argument is what actually causes the crash.
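
To illustrate why a missing thread object can show up as a fault at an address like 0x58, here is a small host-side sketch in plain C. It does not use Zephyr at all: struct demo_thread, its field names, and the helper functions are hypothetical, and the 0x58 offset is chosen only to mirror the MMFAR value in the log, not taken from the real layout of struct k_thread. It compares the address a member write would touch through a NULL file-scope pointer with the address inside a real object.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a kernel thread object. This is NOT Zephyr's
 * struct k_thread; the layout only mirrors the 0x58 seen in the fault log. */
struct demo_thread {
    uint8_t  earlier_fields[0x58];
    uint32_t field_written_first;
};

/* File-scope pointer without an initializer: C zero-initializes it to NULL,
 * so there is no object behind it. */
static struct demo_thread *thread_ptr;

/* File-scope object: the compiler reserves real storage for it. */
static struct demo_thread thread_obj;

/* Address that a write to 'field_written_first' would touch.
 * Nothing is dereferenced here, so this is safe even for NULL. */
static uintptr_t field_address(const struct demo_thread *t)
{
    return (uintptr_t)t + offsetof(struct demo_thread, field_written_first);
}

/* Print both cases side by side. */
static void print_fault_demo(void)
{
    printf("through NULL pointer: 0x%lx\n",
           (unsigned long)field_address(thread_ptr));
    printf("through real object:  0x%lx\n",
           (unsigned long)field_address(&thread_obj));
}
```

Calling print_fault_demo() from main() on a typical desktop platform, where a null pointer converts to integer 0, shows that the first address is just the member offset, while the second lies inside valid storage. The same reasoning suggests passing &thread_data_rssi, with thread_data_rssi declared as an object, gives the kernel real memory to initialize.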