Take the following piece of code:
#include <stdbool.h>
bool global;
bool foo(void) {
if (global) {
global = false;
return true;
}
return false;
}
Is this code theoretically equal to the following?
#include <stdbool.h>
bool global;
bool baz(void) {
bool tmp = global;
global = false;
return tmp;
}
I am inclined to think that foo
and baz
are theoretically equivalent: global
is not volatile and there is nothing else forcing the compiler to consider multithreading (something like atomic_bool
, for example). So I would think that if a branch is way more expensive than a store, then the compiler would opt to go with the baz
solution instead of the foo
one. But I cannot get the compiler to actually do this, taking the GCC 13.2.0 RISC-V compiler with -std=gnu2x -march=rv32if -mabi=ilp32f -O3 -mbranch-cost=2000
still results in foo
. On the other hand, I am also unable to make the compiler turn baz
into foo
.
So am I wrong and is there a theoretical difference between foo
and baz
? Or is this just an optimization GCC (RISC-V?) does not take?
The two functions differ in a multithreaded environment. Consider what happens if there is a thread that reads global
from time to time and foo
or baz
is called while global
contains false
. In this situation, foo
will not store to global
but baz
will.
C 2018 5.1.2.4 4 defines conflict:
Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location.
C 2018 51.2.4 35 says:
The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior.
This means the behavior with baz
is undefined, since the modification to global
(Note 2 in C 2018 3.1 3 tells us “‘Modify’ includes the case where the new value being stored is the same as the previous value.”) in the thread calling baz
conflicts with the read in the other thread.
However, the behavior with foo
remains defined in this situation, as it does not modify global
. (Since the behavior of baz
is undefined, the compiler could generate the same code for it that it does for foo
.)
Note this difference in the semantics of the routines does not preclude the compiler from generating the same code for them. In a single thread, the two routines have the same semantics as defined by the C standard:
global
contains false
and the return value is true
.Therefore, regardless of which routine is called, the single-thread effects on observable behavior and abstract machine state are identical.