Tags: multithreading, concurrency, parallel-processing, thread-safety

What is "Linearizability"?


Can anyone out there help me understand what Linearizability is? I need an explanation that is simple and easy to understand. I'm reading The Art of Multiprocessor Programming by Maurice Herlihy and Nir Shavit and am trying to understand Chapter 3 about Concurrent Objects.

I understand that a method is Linearizable if it has a point where it seems to "take effect" instantaneously from the point of view of the other threads. That makes sense, but it is also said that Linearizability is actually a property of the execution history. What does it mean for an execution history to be Linearizable, why do I care, and how does it relate to a method or object being Linearizable?


Solution

  • Well, I think I can answer this question concisely.

    When we want to tell whether a concurrent object is correct, we always try to find a way to extend the partial order of its method calls to a total order, because we can recognize whether a sequential object is correct much more easily.

    First, let's put the concurrent object aside; we will discuss it later. For now, consider a sequential history H_S: a sequence of events (i.e. Invokes and Responses) in which each Invoke is followed instantaneously by its corresponding Response. ("Instantaneously" could be confusing: in the execution of a single-threaded program there is of course an interval between each Invoke and its Response, but the methods are executed one by one, so "instantaneously" means that no other Invoke/Response can appear between a pair Invoke_i~Response_i.)

    An H_S may look like:

    H_S : I1 R1 I2 R2 I3 R3 ... In Rn
    (Ii means the i-th Invoke, and Ri means the i-th Response) 
    

    It is very easy to reason about the correctness of the history H_S, because there isn't any concurrency. All we need to do is check whether the execution behaves as we expect (meets the conditions of the sequential specification); in other words, whether it is a legal sequential history.
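As a concrete illustration of checking a history against a sequential specification, here is a minimal sketch (my own, not from the book): a sequential history of calls to an inc() method on a counter starting at 0 is legal exactly when the i-th Response returns i+1.

```c
#include <stddef.h>

/* A sequential history of n calls to inc() on a counter (initially 0)
 * is encoded here simply as the array of Response values, in order.
 * It is legal iff the i-th Response returns i+1.
 * This checker is illustrative, not from the book. */
int is_legal_counter_history(const int *responses, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (responses[i] != (int)(i + 1))
            return 0; /* violates the sequential specification */
    }
    return 1;
}
```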

    OK, the reality is that we are working with a concurrent program. For example, suppose we run two threads A and B in our program. Every time we run the program, we get a history H_C (History_Concurrent) of the execution. A method call corresponds to a pair Ii~Ri, as in H_S above. Of course, there can be many overlaps between the method calls made by thread A and thread B. But each event (i.e. each Invoke and Response) has its own position in the real-time order, so the Invokes and Responses of all the methods called by A and B can be mapped to a sequential order, which may look like:

    H_C : IA1 IB1 RA1 RB1 IB2 IA2 RB2 RA2
    

    The order may look confusing; it is just the events of the method calls sorted by real time:

    thread A:         IA1----------RA1               IA2-----------RA2
    thread B:          |      IB1---|---RB1    IB2----|----RB2      |
                       |       |    |    |      |     |     |       |
                       |       |    |    |      |     |     |       |
    real-time order:  IA1     IB1  RA1  RB1    IB2   IA2   RB2     RA2
                    ------------------------------------------------------>time
    

    And so we get H_C. How can we check whether the execution H_C is correct? We can reorder H_C into an H_RO following this rule:

    RULE: If one method call m1 precedes another method call m2 (m1's Response occurs before m2's Invoke), then m1 must precede m2 in the reordered sequence. (That is, if Ri is in front of Ij in H_C, you must guarantee that Ri is still in front of Ij in the reordered sequence; the indices i and j carry no order of their own, we could just as well use a, b, c...) We say that H_C is equivalent to H_RO (history_reordered) under this rule.

    The H_RO will have 2 properties:

    1. It respects the program order.
    2. It preserves the real-time behavior.

    By reordering H_C without violating the rule above, we can get several sequential histories (which are equivalent to H_C). For example:

    H_S1: IA1 RA1 IB1 RB1 IB2 RB2 IA2 RA2
    H_S2: IB1 RB1 IA1 RA1 IB2 RB2 IA2 RA2
    H_S3: IB1 RB1 IA1 RA1 IA2 RA2 IB2 RB2
    H_S4: IA1 RA1 IB1 RB1 IA2 RA2 IB2 RB2
    

    However, we cannot get an H_S5:

    H_S5: IA1 RA1 IA2 RA2 IB1 RB1 IB2 RB2
    

    because IB1~RB1 wholly precedes IA2~RA2 in H_C (RB1 occurs before IA2), so their order cannot be swapped.
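The reordering rule can be checked mechanically. Below is an illustrative sketch (the names and the integer time encoding are my own): each call is an interval [Invoke, Response] in real time, and a candidate sequential order is rejected if it places a call before another call that wholly precedes it in real time.

```c
#include <stddef.h>

/* Each method call is an interval [invoke, response] in real time.
 * A candidate sequential order is a permutation of call indices.
 * The rule: if call a's Response precedes call b's Invoke in real
 * time, a must come before b in the candidate order. */
typedef struct { int invoke; int response; } Call;

int respects_real_time(const Call *calls, const int *order, size_t n) {
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            int earlier = order[i], later = order[j];
            /* 'later' wholly precedes 'earlier' in real time: illegal */
            if (calls[later].response < calls[earlier].invoke)
                return 0;
        }
    }
    return 1;
}
```

Feeding it the H_C above (positions 1..8 in the real-time order, with 0=A1, 1=B1, 2=A2, 3=B2) accepts H_S1 through H_S4 and rejects H_S5.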

    Now, with these sequential histories, how can we confirm whether our execution history H_C is correct? (I emphasize the history H_C: for now we only care about the correctness of the history H_C, not about the correctness of the concurrent program as a whole.)

    The answer is straightforward: if at least one of the sequential histories is correct (a legal sequential history that meets the conditions of the sequential specification), then the history H_C is linearizable, and we call that legal H_S a linearization of H_C. H_C is then a correct execution; in other words, it is a legal execution of the kind we expect. If you have experience with concurrent programming, you have surely written programs that sometimes look quite fine and sometimes go completely wrong.

    Now we know what a linearizable history of a concurrent program's execution is. What about the concurrent program itself?


    The basic idea behind linearizability is that every concurrent history is equivalent, in the following sense, to some sequential history. [The Art of Multiprocessor Programming 3.6.1 : Linearizability] (the "following sense" is the reordering rule discussed above)

    OK, the quotation may be a little confusing. It means that if every concurrent history of the program has a linearization (a legal sequential history equivalent to it), then the concurrent program meets the conditions of linearizability.

    Now we understand what Linearizability is. If we say our concurrent program is linearizable, in other words that it has the property of linearizability, it means that every time we run it, the resulting history is linearizable (the history is what we expect).

    So it is obvious that linearizability is a safety (correctness) property.


    However, reordering all the concurrent histories into sequential histories to judge whether a program is linearizable is only possible in principle. In practice, we are faced with thousands of method calls issued by dozens of threads. We cannot reorder all of their histories; we cannot even enumerate all the concurrent histories of a trivial program.

    The usual way to show that a concurrent object implementation is linearizable is to identify for each method a linearization point where the method takes effect. [The Art of Multiprocessor Programming 3.5.1 : Linearization Points]

    We will now discuss the question in terms of a concurrent object; it is essentially the same as above. An implementation of a concurrent object provides methods to access the object's data, and multiple threads share the concurrent object. So when they access the object concurrently by calling its methods, the implementor of the concurrent object must ensure the correctness of the concurrent method calls.

    The implementor identifies a linearization point for each method. The most important thing is to understand the meaning of linearization point. The statement "where the method takes effect" is really hard to grasp, so here are some examples:

    First, let's look at a wrong case:

    //int i = 0; i is a global shared variable.
    int inc_counter() {
        int j = i++;
        return j;
    }
    

    It's quite easy to find the error. We can translate i++ into:

    #Pseudo-asm-code
    Load   register, address of i
    Add    register, 1
    Store  register, address of i
    

    So two threads can execute "i++;" concurrently, and i may appear to be incremented only once. We could get an H_C like this:

    thread A:         IA1----------RA1(1)                  IA2------------RA2(3)
    thread B:          |      IB1---|------RB1(1)    IB2----|----RB2(2)    |
                       |       |    |        |        |     |     |        |
                       |       |    |        |        |     |     |        |
    real-time order:  IA1     IB1  RA1(1)  RB1(1)    IB2   IA2   RB2(2)   RA2(3)
                    ---------------------------------------------------------->time
    

    However you reorder the real-time order, you will not find a legal sequential history equivalent to this H_C.
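The lost update can be replayed deterministically. The sketch below (illustrative, no real threads) simulates one problematic interleaving of the Load/Add/Store steps of two threads, step by step:

```c
/* Simulate one problematic interleaving of i++ executed by two
 * "threads", using explicit per-thread registers for the
 * Load/Add/Store steps. (A real race is nondeterministic; this
 * replays one bad schedule to show the lost update.) */
int lost_update_demo(void) {
    int i = 0;
    int reg_a, reg_b;    /* per-thread registers */
    reg_a = i;           /* A: Load  (reads 0) */
    reg_b = i;           /* B: Load  (also reads 0) */
    reg_a = reg_a + 1;   /* A: Add   */
    reg_b = reg_b + 1;   /* B: Add   */
    i = reg_a;           /* A: Store (i = 1) */
    i = reg_b;           /* B: Store (i = 1 again: one increment lost) */
    return i;            /* 1, although i++ ran twice */
}
```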

    We should rewrite the program:

    //int i = 0; i is a global shared variable.
    int inc_counter(){
        //do some unrelated work, for example, play a popular song.
        lock(&lock);
        i++;
        int j = i;
        unlock(&lock);
        //do some unrelated work, for example, fetch a web page and print it to the screen.
        return j;
    }
    

    OK, what is the linearization point of inc_counter()? The answer is the whole critical section. When many threads repeatedly call inc_counter(), the critical section is executed atomically, and that guarantees the correctness of the method. The Response of the method is the incremented value of the global i. Consider an H_C like:

    thread A:         IA1----------RA1(2)                 IA2-----------RA2(4)
    thread B:          |      IB1---|-------RB1(1)    IB2--|----RB2(3)    |
                       |       |    |        |         |   |     |        |
                       |       |    |        |         |   |     |        |
    real-time order:  IA1     IB1  RA1(2)  RB1(1)     IB2 IA2   RB2(3)   RA2(4)
    

    Obviously, the equivalent sequential history is legal:

    IB1 RB1(1) IA1 RA1(2) IB2 RB2(3) IA2 RA2(4)  //a legal sequential history
    

    We have reordered IB1~RB1 and IA1~RA1: because they overlap in the real-time order, they may be ordered either way. From the return values in this H_C, we can reason that the critical section of IB1~RB1 was entered first.
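Here is a runnable sketch of the locked counter (the thread and call counts are arbitrary choices of mine): with the mutex around the critical section, the final value is always exactly the total number of calls, which is what a legal sequential history requires.

```c
#include <pthread.h>

/* N_THREADS threads each call inc_counter() N_CALLS times.
 * Because the increment is inside a mutex-protected critical
 * section, no update is lost and the final value is exact. */
enum { N_THREADS = 4, N_CALLS = 10000 };

static int counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static int inc_counter(void) {
    pthread_mutex_lock(&counter_lock);   /* enter critical section */
    counter++;
    int j = counter;                     /* the Response value */
    pthread_mutex_unlock(&counter_lock); /* leave critical section */
    return j;
}

static void *worker(void *arg) {
    (void)arg;
    for (int k = 0; k < N_CALLS; k++)
        inc_counter();
    return NULL;
}

int run_counter_demo(void) {
    pthread_t tids[N_THREADS];
    for (int t = 0; t < N_THREADS; t++)
        pthread_create(&tids[t], NULL, worker, NULL);
    for (int t = 0; t < N_THREADS; t++)
        pthread_join(tids[t], NULL);
    return counter;  /* always N_THREADS * N_CALLS */
}
```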

    The example is too simple. Let's consider another one:

    //top is a sentinel node; top->next points to the top element of the stack
    void push(T val) {
        while (1) {
            Node * new_element = allocate(val);
            Node * next = top->next;
            new_element->next = next;
            if ( CAS(&top->next, next, new_element)) {  //Linearization point1
                //CAS success!
                //ok, we can do some other work, such as go shopping.
                return;
            }
            //CAS fail! retry!
        }
    }
    
    T pop() {
        while (1) {
            Node * next = top->next;
            Node * nextnext = next->next;
            if ( CAS(&top->next, next, nextnext)) { //Linearization point2
                //CAS succeed!
                //ok, let me take a rest.
                return next->value;
            }
            //CAS fail! retry!
        }
    }
    

    It's a lock-free stack algorithm still full of bugs, but don't worry about the details; I just want to show the linearization points of push() and pop(), marked in the comments. When many threads repeatedly call push() and pop(), they are ordered at the CAS step. The other steps hardly matter: however they are executed concurrently, the final effect they have on the stack (precisely, on the top variable) is determined by the order of the CAS steps (the linearization points). If we can make sure the linearization points really work, then the concurrent stack is correct. A picture of such an H_C would be too long, but we can be confident that there is a legal sequential history equivalent to it.
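For comparison, here is a sketch of the same idea in the more common Treiber-stack shape, using C11 atomics and CASing on the top pointer itself (no sentinel node). It is still only a teaching sketch: it ignores the ABA problem and safe memory reclamation, but the single CAS in each retry loop is the linearization point.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Treiber-style lock-free stack sketch: the successful CAS on 'top'
 * is the linearization point of both push and pop. ABA and memory
 * reclamation are deliberately ignored here. */
typedef struct Node { int value; struct Node *next; } Node;

static _Atomic(Node *) top = NULL;

void stack_push(int val) {
    Node *n = malloc(sizeof *n);
    n->value = val;
    for (;;) {
        Node *old = atomic_load(&top);
        n->next = old;
        if (atomic_compare_exchange_weak(&top, &old, n)) /* linearization point */
            return;
        /* CAS failed: top changed under us; retry */
    }
}

int stack_pop(int *out) {
    for (;;) {
        Node *old = atomic_load(&top);
        if (old == NULL)
            return 0;                    /* empty stack */
        Node *next = old->next;
        if (atomic_compare_exchange_weak(&top, &old, next)) { /* linearization point */
            *out = old->value;
            free(old);
            return 1;
        }
        /* CAS failed: retry */
    }
}
```

Run single-threaded, it behaves as an ordinary LIFO stack; under contention, every push/pop is ordered by its winning CAS.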


    So if you are implementing a concurrent object, how can you tell whether your program is correct? You should identify a linearization point for each method and think carefully about (or even prove) whether it always preserves the invariants of the concurrent object. Then the partial order of all the method calls can be extended to at least one legal total order (a sequential history of events) that meets the sequential specification of the concurrent object.

    Then you can say your concurrent object is correct.