I'm working with the NXP LPC1788 microcontroller and I'm developing a multi-threaded application in C. In part of my application I define a custom linked-list data structure. I was previously having problems with my program due to concurrent access to a particular list which I seem to have solved by implementing a "lock acquire" method and a "lock release" method for lists which threads can call before accessing the list itself.
I did this by adding a 'sema' data member to the list struct:
typedef struct linked_list
{
list_node_t *head;
list_node_t *tail;
uint32_t len;
NODE_ITEM_TYPE_T itemType;
uint32_t itemSize;
uint8_t sema;
} linked_list_t;
My "lock acquire" method is given below:
void LIST_AcquireLock(linked_list_t *list)
{
while(list->sema);
list->sema = 1;
}
My "lock release" method is given below:
void LIST_ReleaseLock(linked_list_t *list)
{
list->sema = 0;
}
Generally this seems to work okay, since my application involves adding and removing items to a list like this thousands of times a second and I have not noticed any bugs related to concurrent access since.
However, to be more confident that this works, I was wondering if there was any way of implementing a test-and-set approach. The LPC1788 relies on a version of the Thumb instruction set specific to Cortex-M3 microcontrollers, which can be found here or in the user manual on page 918+.
Looking through it, though, I can't find anything like a test-and-set instruction. I might just be overlooking it.
Ideally, I would like to have something like this:
void LIST_AcquireLock(linked_list_t *list)
{
do{
while(list->sema);
} while(TestAndSet(list->sema));
}
EDIT
Based on Nemo's answer, I've attempted the following:
void LIST_AcquireLock(linked_list_t *list)
{
// Wait until lock seems free.
while(list->sema);
// Make sure lock is actually free.
do {
// If the semaphore is locked, we continue.
// OTHERWISE we try to lock it ourselves.
if(__LDREXB(&(list->sema))) continue;
// If __STREXB returns 1, then another thread might have accessed that
// memory location and we can't be sure the lock operation is atomic,
// so try the locking procedure again.
} while(__STREXB(1, &(list->sema)));
}
If it's helpful, this is the corresponding assembly code:
LIST_AcquireLock:
??LIST_AcquireLock_0:
0x56de: 0x7d01 LDRB R1, [R0, #0x14]
0x56e0: 0x2900 CMP R1, #0
0x56e2: 0xd1fc BNE.N ??LIST_AcquireLock_0 ; 0x56de
??LIST_AcquireLock_1:
0x56e4: 0xf110 0x0114 ADDS.W R1, R0, #20 ; 0x14
0x56e8: 0xe8d1 0x1f4f LDREXB R1, [R1]
0x56ec: 0xb2c9 UXTB R1, R1
0x56ee: 0x2900 CMP R1, #0
??LIST_AcquireLock_2:
0x56f0: 0xf110 0x0114 ADDS.W R1, R0, #20 ; 0x14
0x56f4: 0x2201 MOVS R2, #1
0x56f6: 0xe8c1 0x2f43 STREXB R3, R2, [R1]
0x56fa: 0x2b00 CMP R3, #0
0x56fc: 0xd1f2 BNE.N ??LIST_AcquireLock_1 ; 0x56e4
0x56fe: 0x4770 BX LR
I'm having trouble reproducing the concurrent access issues (assuming it was concurrency issues I was having) so I don't know for sure that this works.
ARM uses a "load-linked/store-exclusive" paradigm for atomic operations. See this question and Section 39.2.4.8 of the user manual you linked for details.
[Update]
Based on the code in the link @HansPassant provided, I would suggest some slight changes to your routine:
void LIST_AcquireLock(linked_list_t *list)
{
// Wait until lock seems free.
//while(list->sema); // unnecessary
// Make sure lock is actually free.
do {
// If the semaphore is locked, we continue.
// OTHERWISE we try to lock it ourselves.
if(__LDREXB(&(list->sema))) continue;
// If __STREXB returns 1, then another thread might have accessed that
// memory location and we can't be sure the lock operation is atomic,
// so try the locking procedure again.
} while(__STREXB(1, &(list->sema)));
// Ensure CPU does not reorder any memory accesses across lock acquisition.
__DMB();
}
The __DMB()
is probably irrelevant on very simple ARM cores, but it is definitely needed on more complex ones. Modern CPUs have complicated memory models.