I am trying to understand which use cases the different kernel synchronization mechanisms (sequential locks vs. RCU (Read-Copy-Update) vs. per-CPU locks) are recommended for when writing a device driver or kernel module. Any examples would be appreciated.
Sequential locks: this is a clever approach to locking in which writers acquire a spinlock and readers avoid locking altogether, at the cost of having to repeat a read that turns out to be inconsistent. The approach is most suitable for data that is read often but seldom updated. It requires that repeating a read has no side effects (as with any reader-writer locking) and that an update in progress cannot confuse a reader, which in practice rules out data containing pointers the reader would follow. To see real usage, search the kernel source for code that uses the following header, data structure, and APIs:
#include <linux/seqlock.h>
/* Simplified view; the exact layout varies across kernel versions. */
typedef struct {
    unsigned seq;       /* incremented every time a writer acquires the lock */
    spinlock_t lock;
} seqlock_t;
write_seqlock(), write_sequnlock()
read_seqbegin(), read_seqretry()
Note: the sequence number is incremented every time a writer acquires the lock, and again when it releases it, so an odd value means a write is in progress. A reader records a copy of the sequence number with read_seqbegin(), performs its read, and then re-examines the sequence number with read_seqretry(); if the number has changed, the reader must re-read. For a contended reader, the redundant reads are no worse than spinning the CPU, whereas an uncontended reader avoids the spinlock altogether.
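To make the retry protocol in the note above concrete, here is a minimal userspace sketch written with C11 atomics. It is not the kernel implementation: the `toy_` names, `pos_x`/`pos_y`, `set_position()` and `get_position()` are all hypothetical, and the default sequentially consistent atomics stand in for the kernel's memory barriers. Only the shape of the reader loop is meant to carry over to read_seqbegin()/read_seqretry().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace model of a seqlock; NOT the kernel's seqlock_t. */
struct toy_seqlock {
    atomic_uint seq;     /* even: data stable, odd: write in progress */
    atomic_int  locked;  /* stands in for the writer-side spinlock */
};

static void toy_write_seqlock(struct toy_seqlock *sl)
{
    while (atomic_exchange(&sl->locked, 1))
        ;                               /* spin: writers exclude each other */
    atomic_fetch_add(&sl->seq, 1);      /* sequence becomes odd */
}

static void toy_write_sequnlock(struct toy_seqlock *sl)
{
    assert(atomic_load(&sl->seq) & 1u); /* caller must hold the write side */
    atomic_fetch_add(&sl->seq, 1);      /* sequence becomes even again */
    atomic_store(&sl->locked, 0);
}

static unsigned toy_read_seqbegin(struct toy_seqlock *sl)
{
    unsigned s;

    while ((s = atomic_load(&sl->seq)) & 1u)
        ;                               /* a writer is active: wait */
    return s;
}

static int toy_read_seqretry(struct toy_seqlock *sl, unsigned start)
{
    return atomic_load(&sl->seq) != start;  /* nonzero: a write intervened */
}

/* Hypothetical protected data: a pair that must be read consistently. */
static struct toy_seqlock pos_lock;     /* static storage: starts at zero */
static atomic_int pos_x, pos_y;

void set_position(int x, int y)
{
    toy_write_seqlock(&pos_lock);
    atomic_store(&pos_x, x);
    atomic_store(&pos_y, y);
    toy_write_sequnlock(&pos_lock);
}

void get_position(int *x, int *y)
{
    unsigned s;

    do {                                /* lock-free reader: retry on change */
        s = toy_read_seqbegin(&pos_lock);
        *x = atomic_load(&pos_x);
        *y = atomic_load(&pos_y);
    } while (toy_read_seqretry(&pos_lock, s));
}
```

In a real driver the reader loop has the same do/while form, with the kernel's read_seqbegin() and read_seqretry() in place of the toy helpers and write_seqlock()/write_sequnlock() around the update.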
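RCU (Read-Copy-Update): this separates updating the data from reclaiming the old copy, so readers avoid locking altogether and writers never block them (concurrent writers still typically serialize among themselves, for example with a spinlock). RCU is mostly used with dynamically allocated, pointer-linked data structures such as linked lists. An RCU writer does not modify the data in place; instead it allocates a new element, initializes it with the updated data, and publishes it by updating the pointer. The old element is freed only after every reader that might still hold a reference to it has finished.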
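The copy, publish, wait, reclaim sequence can be sketched in userspace as follows. This is a toy under loud assumptions, not how the kernel implements RCU: `struct config`, `cur_cfg`, `read_timeout()` and `update_timeout()` are hypothetical, it assumes a single writer, and it tracks readers with one shared counter, which real RCU specifically avoids. The comments name the kernel calls each step roughly corresponds to.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct config {                         /* hypothetical protected payload */
    int timeout_ms;
    int retries;
};

static _Atomic(struct config *) cur_cfg;  /* the RCU-protected pointer */
static atomic_int active_readers;         /* toy grace-period tracking */

/* Reader: roughly rcu_read_lock(), rcu_dereference(), rcu_read_unlock(). */
int read_timeout(void)
{
    struct config *c;
    int t;

    atomic_fetch_add(&active_readers, 1);
    c = atomic_load(&cur_cfg);
    t = c ? c->timeout_ms : -1;         /* -1: nothing published yet */
    atomic_fetch_sub(&active_readers, 1);
    return t;
}

/* Toy stand-in for synchronize_rcu(): wait until no reader is active. */
static void toy_synchronize(void)
{
    while (atomic_load(&active_readers) != 0)
        ;
}

/* Writer (single writer assumed): copy, modify the copy, publish, reclaim. */
int update_timeout(int new_timeout)
{
    struct config *old = atomic_load(&cur_cfg);
    struct config *fresh = malloc(sizeof(*fresh));

    assert(new_timeout >= 0);
    if (!fresh)
        return -1;
    if (old) {
        *fresh = *old;                  /* copy the current version */
    } else {
        fresh->timeout_ms = 0;
        fresh->retries = 0;
    }
    fresh->timeout_ms = new_timeout;    /* update the private copy only */

    atomic_store(&cur_cfg, fresh);      /* roughly rcu_assign_pointer() */
    toy_synchronize();                  /* wait out readers of the old copy */
    free(old);                          /* reclamation, after the wait */
    return 0;
}
```

The point of the sketch is the ordering in the writer: readers that started before the pointer swap keep using the old copy safely, and the old copy is freed only once they are done.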
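Per-CPU variables: these are mostly used for CPU-specific structures, where giving each CPU its own copy of the data avoids a global lock. Note that accesses must still be synchronized with ISRs (and protected against preemption) on the same CPU. Similarly, search for:
#include <linux/percpu.h>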
DEFINE_PER_CPU()
per_cpu(var,cpu)
get_cpu_var(), put_cpu_var()
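The idea behind per-CPU data can be modelled in userspace with one slot per CPU. This is only a single-threaded sketch under assumptions: `TOY_NR_CPUS`, `rx_stats`, `count_packet()` and `total_packets()` are hypothetical, the 64-byte cache line size is assumed, and the caller passes the CPU number explicitly, whereas in the kernel get_cpu_var() selects the current CPU's copy and disables preemption until put_cpu_var().

```c
#include <assert.h>

#define TOY_NR_CPUS 4                   /* hypothetical fixed CPU count */

/* One slot per CPU, aligned to an assumed 64-byte cache line so that
 * neighbouring slots do not share a line (avoids false sharing). */
struct toy_percpu_counter {
    _Alignas(64) unsigned long packets;
};

/* Roughly what DEFINE_PER_CPU() provides: one instance per CPU. */
static struct toy_percpu_counter rx_stats[TOY_NR_CPUS];

/* Fast path: touches only the given CPU's slot, so no global lock. */
void count_packet(unsigned int cpu)
{
    assert(cpu < TOY_NR_CPUS);
    rx_stats[cpu].packets++;
}

/* Slow path: walk every CPU's slot, as per_cpu(var, cpu) allows. */
unsigned long total_packets(void)
{
    unsigned long sum = 0;

    for (unsigned int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        sum += rx_stats[cpu].packets;
    return sum;
}
```

The design trade-off is that updates become cheap and lock-free while a global view costs a pass over all CPUs, which fits statistics counters and similar data that is written often but summed rarely.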