I have set up an example on Compiler Explorer:
int global;

static inline int return3()
{
    int ret;
    __asm__("mov %0, 3"
            : "=g" (ret));
    return ret;
}

void f(int x)
{
    global = return3();
}
The compiler (GCC 12.2) outputs this:
f:
        mov     eax, 3
        mov     DWORD PTR global, eax
        ret
global:
        .zero   4
I don't understand why the assembled code uses eax. The return3 function is inline and the asm statement has the g constraint (which means register, memory or constant), so why can't it generate this?
f:
        mov     DWORD PTR global, 3
        ret
global:
        .zero   4
Interestingly, in this test, with the m constraint, the generated code first allocates space on the stack and saves the 3 there, then moves it to eax and only then moves it to global:
f:
        sub     esp, 16
        mov     DWORD PTR [esp+12], 3
        mov     eax, DWORD PTR [esp+12]
        mov     DWORD PTR global, eax
        add     esp, 16
        ret
global:
        .zero   4
GCC (see footnote 1) inline asm "m" or "g" operands only ever use the address of the actual C object you used, not some other object it's later assigned to. In your case that's the local variable ret.
If the inline asm took the address of its output operand (using lea), it should only get the address of global if you used "=g"(global). Which does in fact happen: https://godbolt.org/z/YYjzs5P3n
If the asm stored multiple different temporary values in that operand, a signal handler shouldn't be able to see them in global. (I know it's not volatile so that's a bit questionable, but do note that at least GNU C de facto supports rolling your own atomics using asm, so a cautious approach makes sense.)
If the asm also accessed global by name, it would be a problem if the value of global had already changed from using %0 as scratch space. That wouldn't actually be safe here because there's no "memory" clobber to tell the compiler this asm statement might read or write a C variable that isn't an explicit operand.
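As a rough sketch of the first point above: if global itself is the output operand, the compiler is free to substitute its memory location for %0, so the asm can store straight into it. This is my own minimal version, not the code behind the Godbolt link. It assumes an x86 target, and it uses AT&T syntax (unlike the question) so that it builds without -masm=intel. The function name store3_direct and the plain-C fallback are made up for illustration.

```c
/* Sketch only: x86, AT&T syntax (the GCC/clang default). */
int global;

void store3_direct(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* global is the operand, so %0 may expand to its memory location
       (or to a register that the compiler then stores to global). */
    __asm__("movl $3, %0" : "=g"(global));
#else
    global = 3;   /* hypothetical stand-in so the sketch builds on other targets */
#endif
}
```

Whether the compiler actually picks memory for "=g" here is still its choice; the constraint only allows it.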
Perhaps you could call it a missed optimization that GCC doesn't pick memory when there's no "memory" clobber, but there are cases where it needs to not do that. The compiler internals are probably complicated enough. (It's only a missed optimization if you discount the other factors, like taking the address, or possibly inventing extra writes to the non-atomic / non-volatile global.)
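For reference, this is roughly what a "memory" clobber looks like when the asm touches a C object that isn't an operand. It's a sketch under assumptions: x86 only, AT&T syntax, and the helper names store3_through_pointer and read_after_asm are hypothetical. Only the pointer value is an operand; the pointed-to int is not, so the clobber is what tells the compiler that memory contents may have changed.

```c
/* Sketch only: x86, AT&T syntax (the GCC/clang default). */
int global;

void store3_through_pointer(int *p)
{
#if defined(__x86_64__) || defined(__i386__)
    /* The asm writes to *p, but only the pointer itself is an operand.
       The "memory" clobber covers the write the operands don't describe. */
    __asm__ volatile("movl $3, (%0)" : : "r"(p) : "memory");
#else
    *p = 3;       /* hypothetical stand-in so the sketch builds on other targets */
#endif
}

int read_after_asm(void)
{
    global = 0;
    store3_through_pointer(&global);
    return global;   /* reloaded after the asm, not assumed to still be 0 */
}
```

Without the clobber, the compiler would be allowed to assume global still holds 0 after the asm statement once the call is inlined.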
Footnote 1: Clang "g" or "=g" does sometimes invent a temporary and then copy, which is weird. https://godbolt.org/z/ochK7GzWf isn't quite that, just showing that it always picks memory for g, and with Intel syntax in fact shows it inventing mov [esp], 3, ambiguous operand-size because it doesn't print memory operands with dword ptr the way GCC does. Intel-syntax inline asm support is new for clang.
TODO: cook up an example of clang's "g" inventing a temporary for input or output, e.g. when it had the value of a global in a register, it might spill it to the stack instead of to its permanent address IIRC.
Often it's a good idea to write "r,m" multi-alternative constraints for clang if you want to try to give it (or GCC) the option of picking memory. Clang will pick r for an input at least. But in this case "=r,m" didn't help.
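For anyone who hasn't seen the multi-alternative syntax, here's a minimal sketch of the question's function with "=r,m". Assumptions: x86 target, AT&T syntax so it builds with default options, and the name return3_rm is made up. The comma separates alternatives, and the template has to be valid for whichever one the compiler chooses.

```c
/* Sketch only: x86, AT&T syntax (the GCC/clang default). */
int return3_rm(void)
{
    int ret;
#if defined(__x86_64__) || defined(__i386__)
    /* Operand 0 has two alternatives: a register ("r") or memory ("m").
       movl $3, %0 assembles correctly with either choice. */
    __asm__("movl $3, %0" : "=r,m"(ret));
#else
    ret = 3;      /* hypothetical stand-in so the sketch builds on other targets */
#endif
    return ret;
}
```

The operand is still the local ret, so this doesn't change which C object the asm writes to; it only changes what the compiler may pick for that object.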