multithreading synchronization locking mutex atomic

Understanding issues with atomic lock operations on multiprocessors

On a uniprocessor, we disable interrupts before performing a lock operation (lock acquire or lock release) to prevent a context switch, and re-enable them once the operation completes.

On a multiprocessor system, however, disabling interrupts alone is not sufficient to make the lock operations atomic.

I read from a source that, "It happens as each processor has a cache, and they can write to the same memory even with the interrupts being disabled."

Q1. Why does this even matter for atomic lock operations?

Q2. What other issues arise when lock operations in a multiprocessor environment are implemented by only disabling interrupts?

Solution

Disabling interrupts alone is insufficient because it only affects the processor that executes the disable. Threads running on other processors can still execute the code of the synchronization object and access its data structures at the same time, so atomicity cannot be achieved this way.

For example, let L be a LOCK object with L.status equal to "FREE", and let X be a process with four threads T1, T2, T3, T4, each running on a separate processor P1, P2, P3, P4.

Let's assume the pseudocode for LOCK::acquire() is as follows:

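 LOCK::acquire(){
        if(status==BUSY){
            // Lock is held: queue the caller and pick another ready thread.
            waitList.add(RunningThread);
            TCB t = readyList.remove();
            t.state=running;                  // mark the next thread before switching to it
            thread_switch(RunningThread,t);
        }
        else{
            status=BUSY;                      // lock was free: take it
        }
 }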

If we only disable interrupts, the code of T1, T2, T3, and T4 can still run on their respective processors. Let's assume the lock is free at some moment.

If all the threads try to acquire lock L at the same time, they may all check the status of the lock at the same time. In that case each thread finds status=="FREE", each sets status to BUSY, and every thread believes it has acquired the lock, which defeats the purpose of the lock.
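To make that interleaving concrete, here is a small sketch that replays it deterministically on a single thread: every simulated processor reads the status first, and only then does each one act on what it read. The names replay_check_then_set, Status, and observed are made up for this illustration; it is a model of the race, not real multiprocessor code.

```cpp
#include <cassert>
#include <vector>

enum class Status { FREE, BUSY };

// Replays one bad interleaving of the check-then-set acquire():
// all "processors" read status before any of them writes it.
// Returns how many of them believe they acquired the lock.
int replay_check_then_set(int processors) {
    assert(processors >= 0);
    Status status = Status::FREE;               // the shared lock word
    std::vector<Status> observed(processors);   // each processor's private read

    // Phase 1: every processor evaluates `if (status == BUSY)` at the same time.
    for (int p = 0; p < processors; ++p) {
        observed[p] = status;
    }

    // Phase 2: every processor acts on the (now stale) value it read.
    int acquired = 0;
    for (int p = 0; p < processors; ++p) {
        if (observed[p] == Status::FREE) {
            status = Status::BUSY;
            ++acquired;
        }
    }
    return acquired;
}
```

With four simulated processors, replay_check_then_set(4) returns 4: all four "hold" a lock that should have exactly one owner.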

That is why atomic read-modify-write operations, such as test_and_set, are used when implementing lock objects for multiprocessors. The hardware performs the read and the write of such an operation as one indivisible step, so only one thread, on only one processor, can observe the lock as free and take it.
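As a rough sketch of what this looks like in practice, the following uses std::atomic_flag from the C++ standard library, whose test_and_set() member atomically sets the flag and returns its previous value. The SpinLock class and the count_with_spinlock helper are hypothetical names for this example. This is a simple busy-waiting lock, not the queueing lock from the pseudocode above.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Minimal spinlock built on test-and-set (illustrative, not production code).
class SpinLock {
    std::atomic_flag busy = ATOMIC_FLAG_INIT;   // clear means FREE, set means BUSY
public:
    void acquire() {
        // test_and_set() sets the flag and returns the old value in one
        // indivisible step. Only the thread that sees `false` owns the lock.
        while (busy.test_and_set(std::memory_order_acquire)) {
            // spin until the current holder calls release()
        }
    }
    void release() {
        busy.clear(std::memory_order_release);
    }
};

// Starts `threads` threads that each increment a shared counter `iters`
// times while holding the lock, then returns the final counter value.
long count_with_spinlock(int threads, int iters) {
    assert(threads > 0 && iters >= 0);
    SpinLock lock;
    long counter = 0;                           // protected by `lock`
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&lock, &counter, iters] {
            for (int j = 0; j < iters; ++j) {
                lock.acquire();
                ++counter;
                lock.release();
            }
        });
    }
    for (auto& t : pool) {
        t.join();
    }
    return counter;
}
```

Calling count_with_spinlock(4, 10000) returns 40000, because no two threads are ever inside the critical section together. On GCC or Clang this typically needs to be built with -pthread.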