java multithreading concurrency java-memory-model

Happens before and program order in Java Memory Model

I have some questions regarding program order and how it affects reorderings in the JMM.

In the Java Memory Model, program order (po) is defined as the total order of actions in each thread in a program. According to the JLS, this induces happens-before (hb) edges:

If x and y are actions of the same thread and x comes before y in program order, then hb(x, y) (i.e. x happens-before y).

So for a simple program P:

  initially, x = y = 0
     T1     |     T2
 -----------|-----------
  1. r1 = x | 3. r2 = y  
  2. y = 1  | 4. x = r2

I think po(1, 2) and po(3, 4). Thus, hb(1, 2) and hb(3, 4).
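For concreteness, P could be written as a small Java harness along these lines. This is only a sketch: the class name ProgramP and the method runOnce are made up for illustration, and x and y are assumed to be plain (non-volatile) static fields. The registers r1 and r2 are returned as a two-element array.

```java
// Hypothetical harness for program P; names are illustrative only.
class ProgramP {
    // Plain (non-volatile) shared variables, as in the question.
    static int x, y;

    /** Runs P once and returns {r1, r2}. */
    static int[] runOnce() {
        x = 0;
        y = 0;
        final int[] r = new int[2];
        Thread t1 = new Thread(() -> {
            r[0] = x;   // 1. r1 = x
            y = 1;      // 2. y = 1
        });
        Thread t2 = new Thread(() -> {
            int r2 = y; // 3. r2 = y
            x = r2;     // 4. x = r2
            r[1] = r2;
        });
        // start() makes the initial writes to x and y visible to both threads.
        t1.start();
        t2.start();
        try {
            // join() makes each thread's writes to r visible to the caller.
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return r;
    }

    public static void main(String[] args) {
        int[] r = runOnce();
        System.out.println("r1=" + r[0] + ", r2=" + r[1]);
    }
}
```

The interesting outcome is r1 = 1 and r2 = 1, which can only be explained if 2 appears to take effect before 1. A single run is unlikely to show it, so this harness only pins down what the program is, not what it will print.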

Now suppose I wanted to reorder some of these statements, giving me P':

  initially, x = y = 0
     T1     |     T2
 -----------|-----------
  2. y = 1  | 3. r2 = y  
  1. r1 = x | 4. x = r2

According to this paper, we can reorder any two adjacent statements (e.g. 1 and 2), provided that the reordering doesn't eliminate any transitive happens-before edges in any valid execution. However, since hb is defined (partially) by po, and po is a total order over a thread's actions, it seems to me that it would be impossible to reorder any two statements without violating hb, which would mean P' is not a legal transformation.

My questions are:

Is my understanding of po and hb correct, and have I correctly defined po and hb with respect to the above program P?
Where is my understanding of reordering with regard to hb failing?

Solution

You're missing this part of the JLS:

It should be noted that the presence of a happens-before relationship between two actions does not necessarily imply that they have to take place in that order in an implementation. If the reordering produces results consistent with a legal execution, it is not illegal.

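In your case, 1 and 2 touch different variables and neither uses the other's result, so they can be flipped without T1 being able to tell the difference. If 2 had instead been y = r1, then 1 would have to execute before 2 to produce the right result.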
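A single-threaded sketch of that point, assuming plain static fields x and y. The class and method names are hypothetical. The first two methods run statements 1 and 2 in either order and end up with the same values, while the third shows the dependent variant, where 2 needs the value produced by 1.

```java
// Hypothetical single-threaded illustration; names are made up.
class ReorderWithinThread {
    static int x, y;

    /** Original order: 1 then 2. Returns {r1, y}. */
    static int[] original() {
        x = 0; y = 0;
        int r1 = x;  // 1. r1 = x
        y = 1;       // 2. y = 1
        return new int[] { r1, y };
    }

    /** Swapped order: 2 then 1. Returns {r1, y}. */
    static int[] swapped() {
        x = 0; y = 0;
        y = 1;       // 2. y = 1
        int r1 = x;  // 1. r1 = x
        return new int[] { r1, y };
    }

    /** Dependent variant: 2 is y = r1, so it cannot move ahead of 1. */
    static int[] dependent() {
        x = 5; y = 0; // x starts at 5 here only to make the dependency visible
        int r1 = x;   // 1. r1 = x
        y = r1;       // 2. y = r1 needs the value read by 1
        return new int[] { r1, y };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(original()));
        System.out.println(java.util.Arrays.toString(swapped()));
        System.out.println(java.util.Arrays.toString(dependent()));
    }
}
```

Within one thread, original() and swapped() are indistinguishable, which is why the swap is allowed. Only another thread reading x and y without synchronization could notice the difference.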

The real problem occurs with multi-processor execution. Without any happens-before boundaries, T2 may observe 2 as having happened before 1, regardless of the order in which T1 actually executed them.

One way to picture this is CPU caching. Let's say T1 executed 1 and 2, in any order. Since no happens-before boundary exists, the effects of these actions may still be sitting in the CPU cache, and depending on other needs, the part of the cache containing the result of 2 may be flushed before the part of the cache that contains the result of 1.

If T2 executes between those two cache flush events, it'll observe 2 has happened and 1 hasn't, i.e. 2 happened before 1, as far as T2 knows.

If this must not be allowed, then T1 must establish a happens-before boundary between 1 and 2, and T2 has to synchronize with it.

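In Java there are various ways of doing that. The old style would be to put 1 and 2 into separate synchronized blocks, because the start and end of a synchronized block are happens-before boundaries, i.e. any action before the block happens before actions inside the block, and any action inside the block happens before actions coming after the block. Across threads, the guarantee comes from the monitor: an unlock of a monitor happens-before every subsequent lock of that same monitor, so T2 has to synchronize on the same object for the ordering to be visible to it.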
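As a rough sketch of the monitor-based approach, here is a minimal handoff between two threads. It is not the exact program from the question: the class SynchronizedHandoff, its fields, and the value 42 are all invented for illustration, and both writes sit in one synchronized block rather than two. The point it tries to show is that the writer and the reader lock the same monitor, so once the reader sees ready == true it is also guaranteed to see the earlier write to data.

```java
// Hypothetical example; class, field, and method names are illustrative.
class SynchronizedHandoff {
    private final Object lock = new Object();
    private int data;       // plain field, only accessed while holding lock
    private boolean ready;  // plain field, only accessed while holding lock

    /** Writer side: both writes are published when the monitor is released. */
    void publish(int value) {
        synchronized (lock) {
            data = value;
            ready = true;
        }
    }

    /** Reader side: polls under the same monitor until the value is published. */
    int awaitValue() {
        while (true) {
            synchronized (lock) {
                if (ready) {
                    return data; // the unlock in publish() happens-before this lock
                }
            }
            Thread.yield(); // simple polling; wait/notify would avoid spinning
        }
    }

    /** Starts a writer thread and returns what the calling thread observes. */
    static int demo() {
        SynchronizedHandoff h = new SynchronizedHandoff();
        Thread writer = new Thread(() -> h.publish(42));
        writer.start();
        return h.awaitValue();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 42
    }
}
```

If the reader skipped the synchronized block and read ready and data directly, nothing would order its reads after the writer's writes, and it could see ready == true with a stale data, or never see ready become true at all.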