c, gcc, inline-assembly

Purpose of '*&x' in inline assembly?


Looking at some x86_64 GCC inline assembly, I have come across the following construct:

int x;
__asm__( "<opcode> %0" : "=m" (*&x) );
                               ^^^

This seems to be some quirk to make the compiler do... something. But on the C level, it's a no-op, so I am wondering... what is this reference-dereference for, and isn't there a less quirky way to achieve the same?


Solution

  • As far as I know, it doesn't serve any functional purpose at all. It would be functionally equivalent, and more concise, to simply write "=m" (x).

    There could be a certain logic to writing "=m" (*&x) or "m" (*&x) as a stylistic choice. Often when using inline assembly with an m operand, you already have a pointer to the memory to be used, and so then you need to use "m" (*ptr). This can be a little counterintuitive, since after all it is the value of ptr that will get used within the asm code. So it's easy to get this wrong and write "m" (ptr) instead, which the compiler won't catch as there are no type constraints on asm operands. Thus the author might have decided to always prefer to write a memory operand in the form "m" (*expr), and so in code where we have the variable x instead of a pointer to it, following this rule would lead to writing "m" (*&x).

    (The compiler would refuse to compile "=m" (&x) because &x is not an lvalue, but if it were an input operand instead, "m" (&x) could result in the compiler materializing the address of x in stack memory, which is of course not what is wanted. Current versions of gcc will treat "m" (&x) as an error, but in older versions it was a non-fatal warning, so perhaps in even older versions it was silently accepted.)
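    As a rough illustration of the three spellings discussed above, here is a minimal sketch. The helper names and the STORE_42 macro are hypothetical, and the movl instruction stands in for the unspecified <opcode>; it assumes AT&T syntax on an x86 target and falls back to a plain C assignment elsewhere so that it still compiles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro: store 42 into an int lvalue through an "=m" operand.
   The x86 branch assumes AT&T syntax (the gcc default). */
#if defined(__x86_64__) || defined(__i386__)
#define STORE_42(lvalue) __asm__("movl $42, %0" : "=m"(lvalue))
#else
#define STORE_42(lvalue) ((lvalue) = 42) /* portable fallback, no asm */
#endif

/* Operand written as the variable itself: "=m" (x). */
int store_plain(void) {
    int x;
    STORE_42(x);
    return x;
}

/* Operand written as "=m" (*&x): names the same object as x. */
int store_deref_addr(void) {
    int x;
    STORE_42(*&x);
    return x;
}

/* When only a pointer is available, the operand must be the pointee,
   "=m" (*p), not the pointer p itself. */
void store_through_pointer(int *p) {
    assert(p != NULL);
    STORE_42(*p);
}
```

    Calling store_plain() and store_deref_addr() should give the same result, and store_through_pointer(&y) should leave that value in y, which is the sense in which *&x is only a stylistic variant of x here.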

    On the other hand, in the commit that introduced this code, there are other instances where the "m" (x) form is used. Perhaps the commit contains code written by multiple authors, or a single author who simply didn't follow their own conventions consistently.


    Regarding some of the explanations suggested in other comments and answers:

    • Generally in C, *&expr is equivalent to expr whenever expr is an lvalue. (I would have thought there would be an SO question to link here, but the closest I could find is "Reference/Dereference Identities", which is more about &*expr.) The main difference is that the compiler must emit a diagnostic for *&x if x is declared register, since the address of a register variable cannot be taken. I doubt that was a motivation here, because there's no reason to think that any future programmer would change the code to register int x;. Moreover, gcc will already complain about "m" (x) when x is declared register, just as it will for &x: see https://godbolt.org/z/a5sf4d9qE.

    • It isn't necessary to write *&x to force x to be allocated in memory, as opposed to a register, because the m constraint already does this. That's the whole point of the constraint system in gcc inline assembly.

      Technically, asm("frob %0" : "=m" (x)) could instead allocate some temporary stack space as the operand for frob, and then copy that space into x afterwards, so x itself could theoretically still live in a register. But generating such code wouldn't be wrong, merely sub-optimal. Normally the compiler would optimize out any such copy, and just pass the address of x itself into the asm, as you would expect.

      Anyhow, for gcc, merely taking the address of a variable does not in itself force the variable to be allocated in memory, if the address is not used in an essential way; see for instance https://godbolt.org/z/sE8Yqx6cG. Indeed, with a modern optimizing compiler and the as-if rule, the very notion of a variable "living" in any particular place becomes fuzzy and ill-defined. So if somehow it were vitally important for x itself to be in memory (which is certainly not the case for the code at hand), then just replacing x by *&x would not suffice; I think you'd have to use volatile.

    • I don't think it's likely that *&x was chosen to work around some compiler bug. The code in question is from 2001 and gcc was quite mature by then. If there were some bug, you'd think they'd have left a comment explaining the situation. Anyway, as noted above, there are other instances within the very same commit where just x is used.

    • I don't agree with a previous answer (which has now been deleted) suggesting that "=m" (x) would potentially cause undefined behavior by reading an indeterminate value. The = indicates that this is an output operand, so it does not read the value of x at all, no more than the expression x = 5; would do.

      It was suggested in a comment that "as far as the C language is concerned, asm looks like a function call, so it doesn't look like an assignment." Although the syntax of gcc asm may be vaguely similar to that of a function call, the semantics are not. And since asm is a gcc extension, and not even part of the standard C language, it does not make sense to try to apply the C standard to its syntax or semantics. The governing document would instead be the gcc manual, which makes it quite clear that a variable associated with an output-only operand is written and not read.
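    To make the output-only point concrete, here is a small sketch contrasting "=m" with the read-write modifier "+m". The "+m" form does not appear in the code under discussion and is added only for comparison; the function names and the HAVE_X86_ASM macro are hypothetical, and the asm again assumes an x86 target with AT&T syntax, with a plain C fallback elsewhere.

```c
/* Hypothetical feature macro: use the x86 asm only where it applies. */
#if defined(__x86_64__) || defined(__i386__)
#define HAVE_X86_ASM 1
#else
#define HAVE_X86_ASM 0
#endif

/* "=m": output-only. x is left uninitialized on purpose; the asm only
   writes it, much like the assignment x = 7 would. */
int write_only(void) {
    int x;
#if HAVE_X86_ASM
    __asm__("movl $7, %0" : "=m"(x));
#else
    x = 7; /* fallback for non-x86 targets */
#endif
    return x;
}

/* "+m": read-write. Here the prior value of x is an input to the asm,
   so x must be initialized before the statement. */
int read_modify_write(int start) {
    int x = start;
#if HAVE_X86_ASM
    __asm__("incl %0" : "+m"(x) : : "cc");
#else
    x += 1; /* fallback for non-x86 targets */
#endif
    return x;
}
```

    Only the second function depends on the value x held before the asm statement, which is why an indeterminate x would matter for "+m" but not for "=m".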