Consider the following kernel code:
local_irq_disable();
__update_rq_clock(rq);
spin_lock(&rq->lock);
rq is a pointer to a per-processor struct (i.e., not subject to SMP concurrency). Since rq will never be accessed anywhere else after calling local_irq_disable (because rq is used by only a single processor, and disabling local interrupts means no interrupt handlers will run on that CPU), what is the point of sandwiching __update_rq_clock between the two other calls? In other words, how does this differ from the following, which disables interrupts and takes the lock in a single call, given that rq is safe inside __update_rq_clock in both cases, locked or not?
spin_lock_irqsave(&rq->lock, flags);
__update_rq_clock(rq);
First and foremost: the two examples you show have different semantics. local_irq_disable does not save the previous IRQ state, so when the corresponding local_irq_enable is called it will forcibly re-enable IRQs, regardless of whether they were already disabled before local_irq_disable. On the other hand, spin_lock_irqsave does save the old IRQ state, so it can later be restored through spin_unlock_irqrestore. For this reason, the two pieces of code you show are quite different, and it doesn't make much sense to compare them directly.
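To make the difference concrete, here is a minimal sketch of the two lock/unlock paths; the surrounding context is illustrative, not taken from the scheduler:

/* Variant A: unconditional disable/enable. */
local_irq_disable();
spin_lock(&rq->lock);
/* ... critical section ... */
spin_unlock(&rq->lock);
local_irq_enable();	/* IRQs are now on, even if the caller already had them off */

/* Variant B: save and restore the previous IRQ state. */
unsigned long flags;
spin_lock_irqsave(&rq->lock, flags);
/* ... critical section ... */
spin_unlock_irqrestore(&rq->lock, flags);	/* IRQs return to whatever they were before */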
Now, coming to the real problem:
"Since rq will never be accessed anywhere else after calling local_irq_disable (because rq is used by only a single processor and disabling local interrupts means no interrupt handlers will run on that CPU)"
This is not always true. There is no "magic barrier" that stops one CPU from accessing another CPU's per-CPU data. It is still possible, and in such cases extra care must be taken by means of a proper locking mechanism.
While per-CPU variables are usually meant to provide fast access to an object for a single CPU, and therefore can have the advantage of not requiring locking, there is nothing other than convention that keeps processors from digging around in other processors' per-CPU data (quote).
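As a hedged illustration of this point (the struct, field and function names below are made up; DEFINE_PER_CPU and per_cpu are the real kernel macros), cross-CPU access to per-CPU data is entirely possible and simply needs its own locking:

#include <linux/percpu.h>
#include <linux/spinlock.h>

struct foo {
	spinlock_t lock;
	unsigned long counter;
};
static DEFINE_PER_CPU(struct foo, foo_data);

/* May run on any CPU, but touches the instance belonging to @cpu. */
static void poke_other_cpu(int cpu)
{
	struct foo *f = &per_cpu(foo_data, cpu);
	unsigned long flags;

	spin_lock_irqsave(&f->lock, flags);	/* remote access => lock required */
	f->counter++;
	spin_unlock_irqrestore(&f->lock, flags);
}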
Runqueues are a great example of this: since the scheduler often needs to migrate tasks from one runqueue to another, it certainly will need to access two runqueues at the same time at some point. Indeed, this is probably one of the reasons why struct rq has a .lock field.
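For instance, here is a simplified sketch of how two runqueue locks can be taken together without deadlocking, modeled on the kernel's double_rq_lock(), which acquires the locks in a fixed (address) order; details such as lockdep nesting annotations are omitted, and the helper name is made up:

static void lock_two_rqs(struct rq *rq1, struct rq *rq2)
{
	if (rq1 == rq2) {
		spin_lock(&rq1->lock);
	} else if (rq1 < rq2) {
		spin_lock(&rq1->lock);
		spin_lock(&rq2->lock);
	} else {
		spin_lock(&rq2->lock);
		spin_lock(&rq1->lock);
	}
}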
In fact, doing an rq clock update without holding rq->lock seems to be considered a bug in recent kernel code, as you can see from this lockdep assertion in update_rq_clock():
void update_rq_clock(struct rq *rq)
{
	s64 delta;

	lockdep_assert_held(&rq->lock);
	// ...
It feels like the statements in your first code snippet should be re-ordered to lock first and then update, but the code is quite old (v2.6.25), and the call to __update_rq_clock() seems to be deliberately made before acquiring the lock. It's hard to tell why, but maybe the old runqueue semantics did not require locking in order to update .clock/.prev_clock_raw, and the locking was done afterwards just to minimize the size of the critical section.
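For reference, a minimal sketch of the ordering suggested above (lock first, then update), which is what the lockdep assertion in modern kernels expects; the surrounding context is illustrative:

unsigned long flags;

spin_lock_irqsave(&rq->lock, flags);
update_rq_clock(rq);	/* clock update now happens under rq->lock */
/* ... work that relies on an up-to-date rq->clock ... */
spin_unlock_irqrestore(&rq->lock, flags);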