Search code examples
operating-systemsynchronizationlockingmutual-exclusion

Bounded-waiting Mutual Exclusion with test and set


I am reading the famous Operating System Concepts book of (Avi Silberschatz, Peter Baer Galvin, Greg Gagne) edition 9: http://codex.cs.yale.edu/avi/os-book/OS9/

In the process synchronization chapter, there is an algorithm for "Bounded-waiting Mutual Exclusion with test_and_set" as follow:

do {
    waiting[i] = true;
    key = true;  // <-- Boolean variable that I do not see its utility
    while (waiting[i] && key) // <-- the value of the key variable here is always true
        key = test_and_set(&lock); // <-- it might become false here, but what is the point?
    waiting[i] = false;

    /* critical section */

    j = (i + 1) % n;
    while ((j != i) && !waiting[j]) 
        j = (j + 1) % n; 
    if (j == i) 
        lock = false; 
    else
        waiting[j] = false;

    /* remainder section */
} while (true); 

I can't see the role of the boolean variable key (used in the 3rd, 4th and 5th lines of the code above), I see that we can remove it without any particular effect on the algorithm, am I right or I have missed something?


Solution

  • You could simplify the algorithm to:

    do {
        waiting[i] = true;
        while (waiting[i] && test_and_set(&lock)) ;
        waiting[i] = false;
    
        /* critical section */
    
        j = (i + 1) % n;
        while ((j != i) && !waiting[j]) 
            j = (j + 1) % n; 
        if (j == i) 
            lock = false; 
        else
            waiting[j] = false;
    
        /* remainder section */
    } while (true);
    

    and it would be the exact same. I guess the authors used key because they thought it would make the code easier to read.

    As asked in the comments:

    Generally, when using test_and_set you simply do while(test_and_set(&lock)) ;. However in this case you want to ensure that the thread only waits a bounded amount of time for the lock. This is accomplished with the waiting array. At the end of the critical section when we unlock, we do not simply set lock to false which is how you typically unlock it. Instead we try to find the next thread that is waiting for the lock. By next, I mean incrementing the thread ID and then looping around when we hit n (the j = (j + 1) % n; part). If such a thread j is found we set waiting[j] to false instead of lock.

    This prevents scenarios where 2 or more threads are constantly grabbing the lock while another thread or group of threads are always waiting. For example, say 3 threads are waiting for the same lock (threads 0, 1 and 2). Say thread 0 releases the lock and then thread 1 grabs it. While thread 1 has the lock thread 0 then tries to grab the lock again and when thread 1 releases the lock thread 0 grabs it instead of thread 2. This could repeat indefinitely and thread 2 never gets the lock.

    In this bounding wait algorithm by using the waiting array this behavior cannot occur. If three threads are constantly grabbing the lock the next thread in terms of thread ID will go next, e.g. thread 0 will grab and release the lock followed by thread 1 and then followed by thread 2. This is because each thread is waiting for either lock or its entry in the waiting array to become false. If another thread is waiting for the lock when a thread is about to release the lock it sets the waiting entry instead of lock releasing only that thread from the spin wait. This prevents the pathological case of one or more threads waiting indefinitely for the lock.