I've noticed that gcc12 does not optimize these two functions to the same code (with -O3):
    int x = 0;

    void f(bool a)
    {
        if (a) {
            ++x;
        }
    }

    void f2(bool a)
    {
        x += a;
    }
Basically no transformation is done. That can be seen here: https://godbolt.org/z/1G3n4fxEK
Optimizing f to the code in f2 seems to be trivial and no jump would be needed anymore. However, I'm curious if there's a reason why this is not done by gcc? Is it somehow still slower or something? I would assume it's never slower and sometimes faster, but I might be wrong.
Thanks
Such a substitution would be incorrect in a scenario where one thread calls f(1) while another thread calls f(0). If x is never actually accessed outside the thread that calls f(1), there would be no race condition in the code as written, but the substitution would create one. If x is initially 1, nothing would prevent the code from being processed as:
    thread 1, running f(0): reads x (1)
    thread 2, running f(1): reads x (1), adds 1, writes 2
    thread 1: adds 0 to the value it read and writes 1
This would cause x to be left holding the value 1 when thread 2 has just written the value 2. Worse than that, if the function was invoked within a context like:
    x = 1;
    f(1);
    if (x != 1)
        launch_nuclear_missiles_if_x_is_1_and_otherwise_make_coffee();
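a compiler might recognize that x will always equal 2 following the return from f(1), and thus make the function call unconditional. If another thread's transformed f(0) then stored a stale 1 into x, the function would run with x equal to 1.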
To be sure, such a substitution would rarely cause problems in real-world situations, but the Standard explicitly forbids transformations that could create race conditions where none would exist in the source code as written.
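The lost update described above can be replayed deterministically. The following is only a sketch: it runs one fixed interleaving on a single thread rather than using real threads, and the helper name replay, its branchless parameter, and the "thread A" / "thread B" labels are made up for illustration. Thread A stands for the caller of f(0) and thread B for the caller of f(1).

```cpp
// Hypothetical helper: replays one fixed interleaving on a single thread,
// so the outcome is deterministic and no real data race occurs.
// `branchless` selects which form of f the f(0) caller was compiled to.
int replay(bool branchless)
{
    int x = 1;                 // stands in for the shared global, initially 1

    // Thread A calls f(0).
    // The branchless form (x += a) loads x even though a is false.
    int a_loaded = 0;
    if (branchless) {
        a_loaded = x;          // A: load x (sees 1)
    }

    // Thread B calls f(1); its whole read-modify-write lands in between.
    x = x + 1;                 // B: load 1, store 2

    // Thread A resumes.
    if (branchless) {
        x = a_loaded + 0;      // A: store stale 1 + 0, overwriting B's 2
    }
    // Branching form: `if (a) ++x;` with a == false never touches x.

    return x;
}
```

With this interleaving, replay(false) returns 2 (the f(0) caller never touches x), while replay(true) returns 1 (the update made by f(1) is lost).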