I was looking into the spinlock code in the Linux kernel (version 3.10.1) and didn't understand one thing.
When acquiring a spinlock through spin_lock_bh(), it goes ahead and calls preempt_disable(). This is the same as the other acquire functions, for example spin_lock() and spin_lock_irq().
But when releasing the lock through spin_unlock_bh(), it calls preempt_enable_no_resched(), which skips calling the scheduler to preempt.
That is not the case for the other corresponding release functions (like spin_unlock() and spin_unlock_irq()). They call the regular preempt_enable() function which calls __schedule().
local_bh_disable() increments the preempt_count counter by a specific value, and preempt_disable() increments it by 1. That is what __raw_spin_lock_bh() does.
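The counter arithmetic can be sketched in user space. This is only a toy model, not kernel code: every `model_`/`MODEL_` name is hypothetical, and the offsets assume that the softirq field of preempt_count starts at bit 8 and that disabling bottom halves adds twice that offset. Check the headers of your kernel version for the real values.

```c
#include <assert.h>

/* Hypothetical user-space model; the values below are assumptions. */
#define MODEL_PREEMPT_OFFSET         1u         /* one preempt_disable() level */
#define MODEL_SOFTIRQ_OFFSET         (1u << 8)  /* assumed start of the softirq field */
#define MODEL_SOFTIRQ_DISABLE_OFFSET (2u * MODEL_SOFTIRQ_OFFSET)

static unsigned int model_preempt_count;

/* Add to the counter, as the disable paths do. */
static void model_add_preempt_count(unsigned int val)
{
    model_preempt_count += val;
}

/* Subtract from the counter; the assert guards the model against underflow. */
static void model_sub_preempt_count(unsigned int val)
{
    assert(model_preempt_count >= val);
    model_preempt_count -= val;
}

/* Same order as described above: bottom halves off first, then preemption. */
static unsigned int model_spin_lock_bh(void)
{
    model_add_preempt_count(MODEL_SOFTIRQ_DISABLE_OFFSET); /* local_bh_disable() */
    model_add_preempt_count(MODEL_PREEMPT_OFFSET);         /* preempt_disable() */
    return model_preempt_count;
}

/* The task may be preempted only when the whole counter is zero. */
static int model_preemptible(void)
{
    return model_preempt_count == 0;
}
```

In this model, subtracting only MODEL_PREEMPT_OFFSET after model_spin_lock_bh() leaves the counter nonzero, so the task is still not preemptible until the softirq part is removed as well.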
The preempt_enable() function (which is invoked from __raw_spin_unlock() and __raw_spin_unlock_irq()) invokes preempt_check_resched(). But in the BH case there is no need to try to schedule at that point, because preemption is still disabled: the part of preempt_count added by local_bh_disable() has not been removed yet. The check will be done inside _local_bh_enable_ip() on function exit.
Looking at the source code, you can see that the real "BH" spinlock unlock sequence is:
spin_release(&lock->dep_map, 1, _RET_IP_);
do_raw_spin_unlock(lock);
preempt_enable_no_resched();
\____barrier();
\____dec_preempt_count(); // <--- decrease counter, but we can't schedule here
local_bh_enable_ip();
\____sub_preempt_count(); // <--- preemption is really re-enabled here
\____preempt_check_resched(); // <--- schedule
By contrast, the "IRQ" spinlock unlock sequence is:
spin_release(&lock->dep_map, 1, _RET_IP_);
do_raw_spin_unlock(lock);
local_irq_enable();
preempt_enable();
\____barrier();
\____dec_preempt_count(); // <--- preemption is really re-enabled here
\____barrier();
\____preempt_check_resched(); // <--- schedule
To sum up: in the case of a BH spinlock, the unlock path just bypasses preempt_check_resched() at that point because it's not needed there.
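The difference between the two unlock paths can be illustrated with a small user-space simulation. It is a simplified sketch, not the kernel implementation: all `model_` names are hypothetical, the softirq-disable increment of 0x200 is an assumption, and softirq processing inside the BH-enable path is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_SOFTIRQ_DISABLE_OFFSET 0x200u /* assumed softirq-disable increment */

struct model_cpu {
    unsigned int preempt_count; /* nonzero means "not preemptible" */
    bool need_resched;          /* stands in for the need-resched flag */
    int resched_checks;         /* how many times a reschedule was tested */
    int schedules;              /* how many times the model "scheduled" */
};

/* Model of preempt_check_resched(): scheduling only works at count zero. */
static void model_check_resched(struct model_cpu *cpu)
{
    cpu->resched_checks++;
    if (cpu->need_resched && cpu->preempt_count == 0) {
        cpu->need_resched = false;
        cpu->schedules++;
    }
}

static void model_dec(struct model_cpu *cpu, unsigned int val)
{
    assert(cpu->preempt_count >= val); /* guard the model against underflow */
    cpu->preempt_count -= val;
}

/* spin_unlock() / spin_unlock_irq() style: drop the count, then check. */
static void model_unlock_plain(struct model_cpu *cpu)
{
    model_dec(cpu, 1);
    model_check_resched(cpu);
}

/* spin_unlock_bh() style: no check after the first decrement. */
static void model_unlock_bh(struct model_cpu *cpu)
{
    model_dec(cpu, 1);                            /* preempt_enable_no_resched() */
    model_dec(cpu, MODEL_SOFTIRQ_DISABLE_OFFSET); /* inside the BH-enable path */
    model_check_resched(cpu);                     /* the check that can succeed */
}

/* What a plain preempt_enable() would do in the BH path: one wasted check. */
static void model_unlock_bh_naive(struct model_cpu *cpu)
{
    model_dec(cpu, 1);
    model_check_resched(cpu);                     /* count still nonzero: no-op */
    model_dec(cpu, MODEL_SOFTIRQ_DISABLE_OFFSET);
    model_check_resched(cpu);
}
```

Starting from a counter of 1 + MODEL_SOFTIRQ_DISABLE_OFFSET with need_resched set, both BH variants end up scheduling once, but the naive one performs an extra check that can never succeed. That extra check is the one the BH unlock path avoids.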