Tags: c, concurrency, linux-kernel, locking, spinlock

Is calling spin_lock_irqsave, instead of local_irq_disable followed by spin_lock, the same for a per-processor struct?


Consider the following kernel code:

local_irq_disable();
__update_rq_clock(rq);
spin_lock(&rq->lock);

rq is a pointer to a per-processor struct (i.e., not subject to SMP concurrency). Since rq will never be accessed anywhere else after calling local_irq_disable (because rq is used by only a single processor, and disabling local interrupts means no interrupt handlers will run on that CPU), what is the point of sandwiching __update_rq_clock between the two calls above? In other words, how does it differ from the following, which disables interrupts and takes the lock in a single call, given that rq is safe inside __update_rq_clock in both cases, locked or not?

spin_lock_irqsave(&rq->lock, flags);
__update_rq_clock(rq);

Solution

  • First and foremost: the two examples you show have different semantics. local_irq_disable does not save the previous IRQ state: when the corresponding local_irq_enable is called, it forcibly re-enables IRQs, regardless of whether they were already disabled before local_irq_disable was called. spin_lock_irqsave, on the other hand, does save the old IRQ state, so it can later be restored through spin_unlock_irqrestore. For this reason the two snippets are not equivalent, and it doesn't make much sense to compare them directly.
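
    To see why the distinction matters, here is a minimal sketch (my_lock and both function names are made up for illustration; only the locking/IRQ primitives are real kernel APIs): if a caller has already disabled interrupts, the local_irq_disable/local_irq_enable pattern turns them back on behind the caller's back, while the save/restore variant preserves whatever IRQ state the caller had.

    #include <linux/spinlock.h>
    #include <linux/irqflags.h>

    static DEFINE_SPINLOCK(my_lock);  /* hypothetical lock, for illustration only */

    /* Forcibly re-enables IRQs on exit, even if the caller had them disabled. */
    static void update_forced(void)
    {
        local_irq_disable();
        spin_lock(&my_lock);
        /* ... critical section ... */
        spin_unlock(&my_lock);
        local_irq_enable();   /* IRQs are on now, whatever the caller expected */
    }

    /* Saves the current IRQ state and restores exactly that state afterwards. */
    static void update_saved(void)
    {
        unsigned long flags;

        spin_lock_irqsave(&my_lock, flags);
        /* ... critical section ... */
        spin_unlock_irqrestore(&my_lock, flags);  /* IRQs stay off if they were off */
    }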

    Now, coming to the real problem:

    Since rq will never be accessed anywhere else after calling local_irq_disable (because rq is used by only a single processor, and disabling local interrupts means no interrupt handlers will run on that CPU)

    This is not always true. There is no "magic barrier" stopping one CPU from accessing another CPU's per-CPU data. It is still possible, and in such cases extra care must be taken through a proper locking mechanism.

    While per-CPU variables are usually meant to provide fast access to an object for a single CPU, and therefore can have the advantage of not requiring locking, there is nothing other than convention that keeps processors from digging around in other processors' per-CPU data (quote).
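
    As a hedged sketch of that point (the per-CPU variable and the function below are made up, not real kernel code; only the per-CPU accessors are real APIs), nothing in the per-CPU API itself prevents code running on one CPU from reaching into another CPU's copy:

    #include <linux/percpu.h>
    #include <linux/types.h>

    static DEFINE_PER_CPU(u64, my_counter);  /* hypothetical per-CPU variable */

    static void touch_remote_cpu_data(int other_cpu)
    {
        /* Fast, lock-free access to *this* CPU's copy: */
        this_cpu_inc(my_counter);

        /* ...but the API also hands out another CPU's copy. Nothing but
         * convention (and explicit locking where needed) protects this
         * access from racing with other_cpu itself. */
        per_cpu(my_counter, other_cpu) = 0;
    }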

    Runqueues are a great example of this: since the scheduler often needs to migrate tasks from one runqueue to another, it certainly will need to access two runqueues at the same time at some point. Indeed, this is probably one of the reasons why struct rq has a .lock field.
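
    A simplified sketch of that idea (not the kernel's actual implementation, which lives in double_rq_lock() and additionally deals with lockdep nesting annotations): when two runqueues must be held at once, they are always taken in a fixed order, so that two CPUs migrating tasks in opposite directions cannot deadlock:

    /* Simplified: lock two runqueues in a stable (address) order. */
    static void lock_two_rqs(struct rq *rq1, struct rq *rq2)
    {
        if (rq1 == rq2) {
            spin_lock(&rq1->lock);   /* same queue: take the lock only once */
        } else if (rq1 < rq2) {
            spin_lock(&rq1->lock);
            spin_lock(&rq2->lock);
        } else {
            spin_lock(&rq2->lock);
            spin_lock(&rq1->lock);
        }
    }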

    In fact, doing an rq clock update without holding rq->lock seems to be considered a bug in recent kernel code, as you can see from this lockdep assertion in update_rq_clock():

    void update_rq_clock(struct rq *rq)
    {
        s64 delta;
    
        lockdep_assert_held(&rq->lock);
    
        // ...
    }

    It feels like the statements in your first code snippet should be re-ordered to lock first and then update, but the code is quite old (v2.6.25), and the call to __update_rq_clock() seems to be deliberately made before acquiring the lock. It's hard to tell why, but perhaps the old runqueue semantics did not require holding the lock in order to update .clock/.prev_clock_raw, and the locking was therefore done afterwards just to minimize the size of the critical section.
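
    For comparison, the ordering that would satisfy the lockdep assertion above is essentially your second snippet plus the matching unlock; in the old code's terms, that would look roughly like:

    unsigned long flags;

    spin_lock_irqsave(&rq->lock, flags);
    __update_rq_clock(rq);    /* clock update now happens under rq->lock */
    /* ... rest of the critical section ... */
    spin_unlock_irqrestore(&rq->lock, flags);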