As described in other posts, without any sort of volatile
or std::atomic
qualification, the compiler and/or processor is at liberty to re-order the sequence of statements (e.g. assignments):
// this code
int a = 2;
int b = 3;
int c = a;
b = c;
// could be re-ordered/re-written as the following by the compiler/processor
int c = 2;
int a = c;
int b = a;
However, are conditionals and control statements (e.g. if
, while
, for
, switch
, goto
) also allowed to be re-ordered, or are they considered essentially a "memory fence"?
int* a = &previously_defined_int;
int b = *a;
if (a == some_predefined_ptr)
{
a = some_other_predefined_ptr; // is this guaranteed to occur after "int b = *a"?
}
If the above statements could be re-ordered (e.g. store a
in a temporary register, update a
, then populate b
by de-reference the "old" a
in the temporary register), which I guess they could be while still satisfying the same "abstract machine" behavior in a single-threaded environment, then why aren't there problems when using locks/mutexes?
bool volatile locked = false; // assume on given implementation, "locked" reads/writes occur in 1 CPU instruction
// volatile so that compiler doesn't optimize out
void thread1(void)
{
while (locked) {}
locked = true;
// do thread1 things // couldn't these be re-ordered in front of "locked = true"?
locked = false;
}
void thread2(void)
{
while (locked) {}
locked = true;
// do thread2 things // couldn't these be re-ordered in front of "locked = true"?
locked = false;
}
Even if std::atomic
was used, non-atomic statements can still be re-ordered around atomic statements, so that wouldn't help ensure that "critical section" statements (i.e. the "do threadX things") were contained within their intended critical section (i.e. between the locking/unlocking).
Edit: Actually, I realize the lock example doesn't actually have anything to do with the conditional/control statement question I asked. It would still be nice to have clarification on both of the questions asked though:
The "as-if" rule ([intro.abstract]) is important to note here:
The semantic descriptions in this document define a parameterized nondeterministic abstract machine. This document places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below
Anything* can be reordered so long as the implementation can guarantee that the observable behavior of the resulting program is unchanged.
Thread synchronization constructs in general cannot be implemented properly without fences and preventing reordering. For example, the standard guarantees that lock/unlock operations on mutexes shall behave as atomic operations. Atomics also explicitly introduce fences, especially with respect to the memory_order
specified. This means that statements depending on an (unrelaxed) atomic operation cannot be reordered, otherwise the observable behavior of the program may change.
[intro.races] talks a great deal about data races and ordering.