Search code examples
copenmpatomic

OpenMP and C: Must this operation be made atomic?


Suppose we have a couple of arrays according to the following type declarations

bool B[ N ];
Bar Foo[ M ];
int B_of_Foo[ M ]; // 0 <= B_of_Foo[m] < N

B and Foo may contain arbitrary values (depending on context) while the entries of B_of_Foo contain are restricted to indices between 0 and N-1. Let us take a look at the following code

bool repeat = true

while( repeat ) {

    repeat = false;

    for( int m = 0; m < M; m++ ) {
      if( complicated_condition( Foo[m] ) {
        B [ B_of_Foo[ m ] ] = true
        repeat |= true;
      }
    }

}

complicated_condition is true if some AND or ORs of Bs is true but B[ B_of_Foo[ m ] ] is false. This code is guaranteed to finish when executed sequentially. I want to parallelize this with OpenMP. The variable repeat can be treated by reduction. I wonder whether the update

B [ B_of_Foo[ m ] ] = true

must then be marked as an atomic operation.

I think that any concurrent or duplicated updates will produce the same result. Even if one threads checks complicated_condition with an outdated version of B[ B_of_Foo[ m ] ], its subsequent write operation will not change that entry of B, and the code is stable even if the while-loop is repeated without any update of B.


Solution

  • Yes, the update of B [ B_of_Foo[ m ] ] needs to be atomic (or otherwise serialized), because the result of multiple threads writing to the same variable is unspecified, even if they are writing the same value. From Section 1.4.1 (Memory Model) of the OpenMP 3 standard:

    If multiple threads write without synchronization to the same memory unit, including cases due to atomicity considerations as described above, then a data race occurs. Similarly, if at least one thread reads from a memory unit and at least one thread writes without synchronization to that same memory unit, including cases due to atomicity considerations as described above, then a data race occurs. If a data race occurs then the result of the program is unspecified.

    When starting, most people have in mind a memory model where there's some type of guarantee of sequential consistency - like in the POSIX filesystem - where if two concurrent write operations occur, they behave as if they're serialized with a random one "winning". It doesn't help that that's the sort of example most of use use when teaching about data races. But no such thing is guaranteed, and in principle the result can be complete gibberish. (Whether it ever happens in practice on your favourite architecture is a different question. Here, your values are single bytes and I have to think you'd be ok in most implementations on x86. But nothing is guaranteed.)