Consider this example:
volatile unsigned int x;
unsigned int y;
void f() {
    x /= 2;
}
void g() {
    y /= 2;
}
When compiled with -Os, clang-6.0 on x86-64 produces the same shrl <offset>(%rip) instruction for both
f and g (see https://godbolt.org/g/hUPprL), while gcc-7.3 produces this for f() (see https://godbolt.org/g/vMcKVV):
mov 0x200b67(%rip),%eax # 601034 <x>
shr %eax
mov %eax,0x200b5f(%rip) # 601034 <x>
Is this just a missed optimization, or is there a justification for gcc to reject shrl <offset>(%rip)
for a volatile access? Who is wrong?
This is just a missed optimization by gcc. Both implementations preserve the read from and write to x
precisely, and thus are correct.
Under the hood, an instruction that operates on a memory operand performs the same load and store as the longer sequence: one read of x followed by one write of x.
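To illustrate why both code sequences are acceptable, here is a minimal sketch of the two source-level forms they correspond to. The function names f_compound and f_explicit are hypothetical and used only for illustration. The point is that a compound assignment on a volatile object implies exactly one read and one write of that object, which is the same set of accesses as the spelled-out version.

```c
volatile unsigned int x;

/* Compound assignment: one volatile read of x, then one volatile write. */
void f_compound(void) {
    x /= 2;
}

/* Spelled-out form with a temporary (hypothetical helper for comparison):
   the same single read and single write of x, in the same order. */
void f_explicit(void) {
    unsigned int tmp = x;  /* the one volatile read */
    tmp /= 2;              /* arithmetic on a non-volatile temporary */
    x = tmp;               /* the one volatile write */
}
```

A compiler may emit either the single memory-operand shift or the load/shift/store sequence for both functions, since the number and order of accesses to x is the same in each case. Neither form is atomic with respect to other threads or to interrupt handlers.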