Following this manual I wanted to create simplest inline AVR assembly snippet possible: copy values of two variables to two other variables.
uint8_t a, b, c, d;
a = 42;
b = 11;
asm(
"mov %0, %2\n\t"
"mov %1, %3\n\t"
: "=r" (c), "=r" (d)
: "r" (a), "r" (b)
);
I would expect it to be equivalent to:
uint8_t a, b, c, d;
a = 42;
b = 11;
c = a;
d = b;
However, after running both values of c
and d
are equal to 42. If I change the asm snipptet to:
asm(
"mov %0, %3\n\t"
"mov %1, %2\n\t"
: "=r" (c), "=r" (d)
: "r" (a), "r" (b)
);
c
is equal to 11 and d
is equal to 42 as expected. Similarly, changing both source operands to %2
yields two 42 and setting both of them to %3
yields two 11.
Why the first version does not work as intended?
I would expect it to be equivalent to:
uint8_t a, b, c, d;
a = 42;
b = 11;
c = a;
d = b;
No, it's not1. The reason is that in the C code, one assignment follows after the other, whereas in inline asm, the compiler treats the "code" as if it happens at once. The compiler does not analyze the code in the asm string template in any way, it's just a string on which it performs replacements of %-operands. In
asm ("mov %0, %3" "\n\t"
"mov %1, %2"
: "=r" (c), "=r" (d)
: "r" (a), "r" (b));
the lifetime of a
and b
ends at the asm, and the lifetime of c
and d
begins. Therefore, it's totally fine for the compiler to use the same register for, say c
and a
. This means the output of the 1st move overrides the input of the 2nd move. This is the classic early-clobber situation, and you'll have to tell this fact to the compiler by means of early-clobber modifier &
:
asm ("mov %0, %3" "\n\t"
"mov %1, %2"
: "=&r" (c), "=r" (d)
: "r" (a), "r" (b));
However, the code that's generated is sub-optimal because it's actually fine if the compiler uses the same register for c
and b
, and the same register for d
and a
. This means you don't need any explicit asm code at all, and everything can be described by means of the constraints:
asm (""
: "=r" (c), "=r" (d)
: "1" (a), "0" (b));
1Apart from that, your asm code tries to implement c = b
and d = a
, not c = a
and d = b
.