Tags: multithreading, embedded, locking, interrupt, multicore

Guidelines for the use of critical sections and spinlocks to share resources between tasks and ISRs in single and multicore systems


I'm developing an application for a multicore system (ESP32) in which data from an SPI slave device is read and stored in the context of an ISR, while a task outputs the read data through the UART. I'm using critical sections and spinlocks as the synchronization mechanism.
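
Roughly, the pattern I have in mind looks like the sketch below (simplified, not my actual code; the handler signature, buffer and sizes are just for illustration). On the ESP32, portENTER_CRITICAL() / portEXIT_CRITICAL() take a spinlock and also disable interrupts on the calling core, and the _ISR variants are used from interrupt context:

```c
/* Minimal sketch (ESP-IDF style): a byte buffer filled by the SPI ISR and
 * drained by a UART task. Buffer handling and handler names are illustrative. */
#include <string.h>
#include <stdint.h>
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "driver/uart.h"
#include "esp_attr.h"

#define BUF_SIZE 256

static portMUX_TYPE s_mux = portMUX_INITIALIZER_UNLOCKED; /* spinlock + local IRQ mask */
static uint8_t s_buf[BUF_SIZE];
static size_t  s_len = 0;

/* Called from the SPI ISR context (signature is made up for the sketch). */
void IRAM_ATTR spi_slave_isr_handler(const uint8_t *data, size_t n)
{
    portENTER_CRITICAL_ISR(&s_mux);   /* masks interrupts on this core + takes the lock */
    if (s_len + n <= BUF_SIZE) {
        memcpy(&s_buf[s_len], data, n);
        s_len += n;
    }
    portEXIT_CRITICAL_ISR(&s_mux);
}

/* UART output task. */
void uart_task(void *arg)
{
    uint8_t local[BUF_SIZE];
    for (;;) {
        size_t n;
        portENTER_CRITICAL(&s_mux);   /* same lock, task-context variant */
        n = s_len;
        memcpy(local, s_buf, n);
        s_len = 0;
        portEXIT_CRITICAL(&s_mux);

        if (n) {
            uart_write_bytes(UART_NUM_0, (const char *)local, n);
        }
        vTaskDelay(pdMS_TO_TICKS(10));
    }
}
```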

I have read several posts about this, but I still find it complex. I would like to summarize what I have understood and ask whether it is correct. To clarify: I'm not claiming that the text below is correct; my question is whether it is.

I would like to first clarify something about critical sections and spinlocks: in this question, by "critical section" I mean a section of code that runs with interrupts disabled on the current core, and by "spinlock" a lock that is acquired by busy-waiting (spinning) until it becomes free.

Finally, about ISRs and tasks: ISRs can interrupt OS tasks, but not the other way around. Higher-priority ISRs can interrupt lower-priority ones.

Single-core systems

  • Only a single core, so all tasks and interrupts land on the same CPU.
  • Spinlocks are normally not appropriate for single-processor systems, because the condition that would break a process out of the spinlock can only be produced by running a different process. Busy-waiting on a lock that won't be released until the busy-wait ends and a context switch happens makes no sense. Use a mutex or semaphore instead, which blocks (forces a context switch) if the lock cannot be acquired.
  • EDIT: I thought spinlocks could make sense in single-core systems to synchronize with an ISR, since the ISR could interrupt the task waiting on the spinlock and release it. As explained by Peter Cordes, this would actually cause a deadlock: if the ISR ever has to wait for the lock, it spins forever, because the task that holds it cannot run until the ISR returns. Disable interrupts instead (see the sketch after this list).
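
For the single-core case, the "disable interrupts" approach would look roughly like the sketch below. The irq_save() / irq_restore() helpers are hypothetical placeholders for whatever the platform provides (e.g. PRIMASK manipulation on a Cortex-M); the shared counter is just an example.

```c
/* Minimal single-core sketch: protect a shared counter against an ISR by
 * masking interrupts around the task-side access. */
#include <stdint.h>

static volatile uint32_t shared_count;

/* Hypothetical HAL helpers: on a real target these would map to the
 * platform's interrupt-masking intrinsics. Stubbed here for illustration. */
static uint32_t irq_save(void)               { /* disable IRQs, return old state */ return 0; }
static void     irq_restore(uint32_t state)  { (void)state; /* restore old state */ }

void isr_handler(void)
{
    /* Runs with the task preempted, so it can touch the shared data directly. */
    shared_count++;
}

void task_consume(void)
{
    uint32_t saved = irq_save();      /* begin critical section: the ISR cannot run now */
    uint32_t snapshot = shared_count;
    shared_count = 0;
    irq_restore(saved);               /* end critical section */

    (void)snapshot;                   /* ... process the snapshot ... */
}
```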

Multi-core systems

  • Several cores, so different tasks and/or interrupts can land on different CPUs.
  • Disabling interrupts is not appropriate for multicore systems, because interrupts are disabled only on one of the cores, so mutual exclusion is not assured.
  • Spinlocks are appropriate on these systems, as the lock a task is spinning on can be released by code running on a different core.
  • If the API's spinlock does not disable interrupts, they must be disabled manually before acquiring the spinlock whenever the lock is shared with an ISR; otherwise a deadlock can occur (see https://linux-kernel-labs.github.io/refs/pull/189/merge/labs/interrupts.html#background-information and the sketch after this list).
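
Since the link above covers the Linux kernel, here is a rough sketch of that pattern in Linux-kernel style (handler and variable names are made up): spin_lock_irqsave() disables interrupts on the local CPU and then takes the spinlock, while the interrupt handler itself can use plain spin_lock() because interrupts are already off in that context.

```c
/* Sketch of "spinlock shared with an interrupt handler" in Linux-kernel style. */
#include <linux/spinlock.h>
#include <linux/interrupt.h>

static DEFINE_SPINLOCK(buf_lock);
static int buf_len;

/* Interrupt handler: local interrupts are already disabled here. */
static irqreturn_t my_irq_handler(int irq, void *dev)
{
    spin_lock(&buf_lock);
    buf_len++;
    spin_unlock(&buf_lock);
    return IRQ_HANDLED;
}

/* Process (task) context: mask local IRQs, then spin on the lock. */
static void consume(void)
{
    unsigned long flags;

    spin_lock_irqsave(&buf_lock, flags);
    buf_len = 0;
    spin_unlock_irqrestore(&buf_lock, flags);
}
```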

Am I missing something?


Solution

  • Terminology: in general, critical section (wikipedia) is a computer-science term for code that must run with mutual exclusion. It's not specific to mutual exclusion created by disabling interrupts; it also applies to using a mutex.

    I don't know if there's a subset of the embedded world where "critical section" specifically means the code between disabling and re-enabling interrupts on this core, but you're right that it's only sufficient to protect against other code on the same core (e.g. for a thread-local or per-core variable).


    What you're saying looks correct to me, taking into account the meaning you're using for "critical section". Indeed, mutual exclusion without deadlocks between main threads and interrupt handlers on a multi-core system is a tricky problem.

    However, spinlocks could make sense in single-core systems to synchronize with an ISR, as ISRs could interrupt the task waiting on the spinlock and unlock it.

    You'd need something much more complex than a spinlock for that case! A spinlock implies simple spin-wait, which would deadlock with an ISR waiting for the code it interrupted to run more.

    And many ISR contexts need to finish quickly and might be in the middle of talking to hardware, so a context switch to a user-space thread to let it finish wouldn't be what you want anyway. Disabling interrupts or lock-free approaches (like LL/SC, e.g. ARM64 ldaxr / stlxr) are widely used on unicore systems.
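
    For instance, a single-producer/single-consumer ring buffer lets one ISR hand data to one task without any lock at all. A minimal sketch with C11 atomics (names and size are arbitrary, not from any particular library):

```c
/* Minimal lock-free SPSC ring buffer sketch (C11 atomics): the ISR is the
 * only writer of `head`, the task the only writer of `tail`, so neither
 * side ever waits for the other. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 128u               /* power of two */

static uint8_t rb_data[RB_SIZE];
static _Atomic uint32_t rb_head;   /* written by the ISR (producer) */
static _Atomic uint32_t rb_tail;   /* written by the task (consumer) */

/* Producer side, called from the ISR. Returns false if the buffer is full. */
bool rb_push(uint8_t byte)
{
    uint32_t head = atomic_load_explicit(&rb_head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&rb_tail, memory_order_acquire);
    if (head - tail == RB_SIZE)
        return false;              /* full: drop or count an overrun */
    rb_data[head % RB_SIZE] = byte;
    atomic_store_explicit(&rb_head, head + 1, memory_order_release);
    return true;
}

/* Consumer side, called from the task. Returns false if the buffer is empty. */
bool rb_pop(uint8_t *out)
{
    uint32_t tail = atomic_load_explicit(&rb_tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&rb_head, memory_order_acquire);
    if (head == tail)
        return false;              /* empty */
    *out = rb_data[tail % RB_SIZE];
    atomic_store_explicit(&rb_tail, tail + 1, memory_order_release);
    return true;
}
```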


    Lock-free LL/SC code will have to retry if it was interrupted between the Load-Linked and Store-Conditional, but unlike taking a lock, other code can make forward progress on atomic operations on the variable you were starting to modify. (Hence the term lock-free.)
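
    For example, the usual way to get such a retry loop from C is a compare_exchange_weak loop, which compilers lower to LL/SC on ARM (or to a single CAS instruction on ARMv8.1). A minimal sketch of a saturating counter increment, something a plain fetch_add can't express (names are arbitrary):

```c
/* Sketch of the retry loop that LL/SC gives you: on failure (e.g. an
 * interrupt landed between the load and the store), `old` is reloaded
 * and the loop simply retries. */
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint32_t counter;

void add_saturating(uint32_t n, uint32_t limit)
{
    uint32_t old = atomic_load_explicit(&counter, memory_order_relaxed);
    uint32_t new_val;
    do {
        new_val = old + n;
        if (new_val < old || new_val > limit)   /* wrapped or past the limit */
            new_val = limit;                    /* clamp */
        /* On CAS failure, `old` is updated with the current value and we retry. */
    } while (!atomic_compare_exchange_weak_explicit(
                 &counter, &old, new_val,
                 memory_order_relaxed, memory_order_relaxed));
}
```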

    Single-instruction atomics like ARM's old swp instruction (.exchange()) and ARMv8.1 instructions like ldadd (.fetch_add) potentially increase interrupt latency vs. LL/SC, which can take an interrupt at any point instead of doing a lot of work in one instruction (although a core could potentially discard partial progress). (The old swp can't implement compare-and-swap, so it isn't a building block for lock-free algorithms in general, mostly just for locking; but the new ARMv8.1 instructions include CAS and single-instruction support for many of the C++11 atomic integer ops, so those don't require CAS retry loops.)

    Lock-free algorithms involving multiple variables can get pretty complex, and can sometimes be slower than just locking depending on the use-case and number of threads, but they're usually excellent for things like updating a single counter. Or a SeqLock is excellent for a counter updated infrequently by a timer interrupt or one thread and read by other threads.
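
    A minimal sketch of the SeqLock idea with C11 atomics, for a two-word value written by a single timer ISR and read by other threads or cores (field and function names are made up):

```c
/* Minimal SeqLock sketch: one writer, any number of readers. Readers retry
 * if they observed an update in progress. */
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint32_t seq;        /* even = stable, odd = update in progress */
static _Atomic uint32_t val_lo;     /* the two halves of the protected value */
static _Atomic uint32_t val_hi;

/* Single writer, e.g. called from a timer interrupt. */
void seqlock_write(uint32_t lo, uint32_t hi)
{
    uint32_t s = atomic_load_explicit(&seq, memory_order_relaxed);
    atomic_store_explicit(&seq, s + 1, memory_order_relaxed);  /* now odd */
    atomic_thread_fence(memory_order_release);  /* pairs with readers' acquire fence */
    atomic_store_explicit(&val_lo, lo, memory_order_relaxed);
    atomic_store_explicit(&val_hi, hi, memory_order_relaxed);
    atomic_store_explicit(&seq, s + 2, memory_order_release);  /* even again: publish */
}

/* Readers: loop until they get a consistent snapshot. */
uint64_t seqlock_read(void)
{
    uint32_t s1, s2, lo, hi;
    do {
        s1 = atomic_load_explicit(&seq, memory_order_acquire);
        lo = atomic_load_explicit(&val_lo, memory_order_relaxed);
        hi = atomic_load_explicit(&val_hi, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s2 = atomic_load_explicit(&seq, memory_order_relaxed);
    } while (s1 != s2 || (s1 & 1));  /* retry if an update was in progress */
    return ((uint64_t)hi << 32) | lo;
}
```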