Search code examples
carmmicrocontrollerthumblpc

Atomic test-and-set for LPC1788 microcontroller


I'm working with the NXP LPC1788 microcontroller and I'm developing a multi-threaded application in C. In part of my application I define a custom linked-list data structure. I was previously having problems with my program due to concurrent access to a particular list which I seem to have solved by implementing a "lock acquire" method and a "lock release" method for lists which threads can call before accessing the list itself.

I did this by adding a 'sema' data member to the list struct:

typedef struct linked_list
{
  list_node_t *head;
  list_node_t *tail;
  uint32_t len;
  NODE_ITEM_TYPE_T itemType;
  uint32_t itemSize;
  uint8_t sema;
} linked_list_t;

My "lock acquire" method is given below:

void LIST_AcquireLock(linked_list_t *list)
{
  while(list->sema);
  list->sema = 1;
}

My "lock release" method is given below:

void LIST_ReleaseLock(linked_list_t *list)
{
  list->sema = 0;
}

Generally this seems to work okay, since my application involves adding and removing items to a list like this thousands of times a second and I have not noticed any bugs related to concurrent access since.

However, to be more confident that this works, I was wondering if there was any way of implementing a test-and-set approach. The LPC1788 relies on a version of the Thumb instruction set specific to Cortex-M3 microcontrollers, which can be found here or in the user manual on page 918+.

Looking through it, though, I can't find anything like a test-and-set instruction. I might just be overlooking it.

Ideally, I would like to have something like this:

void LIST_AcquireLock(linked_list_t *list)
{
  do{
    while(list->sema);
  } while(TestAndSet(list->sema));
}

EDIT

Based on Nemo's answer, I've attempted the following:

void LIST_AcquireLock(linked_list_t *list)
{
  // Wait until lock seems free.
  while(list->sema);

  // Make sure lock is actually free.
  do {

    // If the semaphore is locked, we continue.
    // OTHERWISE we try to lock it ourselves.
    if(__LDREXB(&(list->sema))) continue;

    // If __STREXB returns 1, then another thread might have accessed that
    // memory location and we can't be sure the lock operation is atomic,
    // so try the locking procedure again.
  } while(__STREXB(1, &(list->sema)));
}

If it's helpful, this is the corresponding assembly code:

LIST_AcquireLock:
??LIST_AcquireLock_0:
       0x56de: 0x7d01         LDRB      R1, [R0, #0x14]
       0x56e0: 0x2900         CMP       R1, #0
       0x56e2: 0xd1fc         BNE.N     ??LIST_AcquireLock_0    ; 0x56de
??LIST_AcquireLock_1:
       0x56e4: 0xf110 0x0114  ADDS.W    R1, R0, #20             ; 0x14
       0x56e8: 0xe8d1 0x1f4f  LDREXB    R1, [R1]
       0x56ec: 0xb2c9         UXTB      R1, R1
       0x56ee: 0x2900         CMP       R1, #0
??LIST_AcquireLock_2:
       0x56f0: 0xf110 0x0114  ADDS.W    R1, R0, #20             ; 0x14
       0x56f4: 0x2201         MOVS      R2, #1
       0x56f6: 0xe8c1 0x2f43  STREXB    R3, R2, [R1]
       0x56fa: 0x2b00         CMP       R3, #0
       0x56fc: 0xd1f2         BNE.N     ??LIST_AcquireLock_1    ; 0x56e4
       0x56fe: 0x4770         BX        LR

I'm having trouble reproducing the concurrent access issues (assuming it was concurrency issues I was having) so I don't know for sure that this works.


Solution

  • ARM uses a "load-linked/store-exclusive" paradigm for atomic operations. See this question and Section 39.2.4.8 of the user manual you linked for details.

    [Update]

    Based on the code in the link @HansPassant provided, I would suggest some slight changes to your routine:

    void LIST_AcquireLock(linked_list_t *list)
    {
      // Wait until lock seems free.
      //while(list->sema); // unnecessary
    
      // Make sure lock is actually free.
      do {
    
        // If the semaphore is locked, we continue.
        // OTHERWISE we try to lock it ourselves.
        if(__LDREXB(&(list->sema))) continue;
    
        // If __STREXB returns 1, then another thread might have accessed that
        // memory location and we can't be sure the lock operation is atomic,
        // so try the locking procedure again.
      } while(__STREXB(1, &(list->sema)));
    
      // Ensure CPU does not reorder any memory accesses across lock acquisition.
      __DMB();
    }
    

    The __DMB() is probably irrelevant on very simple ARM cores, but it is definitely needed on more complex ones. Modern CPUs have complicated memory models.