I'm having this code:
uint16_t swap_bytes(uint16_t x)
{
asm volatile(
"eor, %A0, %B0" "\n\t"
"eor, %B0, %A0" "\n\t"
"eor, %A0, %B0" "\n\t"
: "=r" (x)
: "0" (x)
);
return x;
}
Which translates (by avr-gcc version 4.8.1
with -std=gnu99 -save-temps
) to:
.global swap_bytes
.type swap_bytes, @function
swap_bytes:
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
/* #APP */
; 43 "..\lib\own\ownlib.c" 1
eor, r24, r25
eor, r25, r24
eor, r24, r25
; 0 "" 2
/* #NOAPP */
ret
.size swap_bytes, .-swap_bytes
But then the compiler is complaining like that:
|65|Error: constant value required|
|65|Error: garbage at end of line|
|66|Error: constant value required|
|66|Error: garbage at end of line|
|67|Error: constant value required|
|67|Error: garbage at end of line|
||=== Build failed: 6 error(s), 0 warning(s) (0 minute(s), 0 second(s)) ===|
The mentioned lines are the ones with the eor
commands. Why does the compiler having problems with that? The registers are even upper (>= r16) ones where nearly all operations are possible. constant value required
sounds to me like it expects a literal... I dont get it.
Just to clarify for future googlers:
eor, r24, r25
has an extra comma after the eor. This should be written as:
eor r24, r25
I would also encourage you (again) to consider using gcc's __builtin_bswap16. In case you are not familiar with the gcc 'builtin' functions, they are functions that are built into the compiler, and (despite looking like functions) are typically inlined. They have been written and optimized by people who understand all the ins and outs of the various processors and can take into account things you may not have considered.
I understand the desire to keep code as small as possible. And I accept that it is possible that (somehow) this builtin on your specific processor is producing sub-optimal code (I assume you have checked?). On the other hand, it may produce exactly the same code. Or it may use some even more clever trick to do this. Or it might interleave instructions from the surrounding code to take advantage of pipelining (or some other avr-specific thing that I have never heard of because I don't speak 'avr').
What's more, consider this code:
int main()
{
return __builtin_bswap16(12345);
}
Your code always takes 3 instructions to process a swap. However with builtins, the compiler can recognize that the arg is constant and compute the value at compile time instead of at run time. Hard to be more efficient than that.
I could also point out the benefits of "easier to support." Writing inline asm is HARD to do correctly. And future maintainers hate to touch it cuz they're never quite sure how it works. And of course, the builtin is going to be more cross-platform portable.
Still not convinced? My last pitch: Even after you fix the commas, your inline asm code still isn't quite right. Consider this code:
int main(int argc, char *argv[])
{
return swap_bytes(argc) + swap_bytes(argc);
}
Because of the way you have written written swap_bytes (ie using volatile
), gcc must compute the value twice (see the definition of volatile). Had you omitted volatile (or if you had used the builtin which does this correctly), it would have realized that argc doesn't change and re-used the output from the first call. Did I mention that correctly writing inline asm is HARD?
I don't know your code, constraints, level of expertise or requirements. Maybe your solution really is the best. The most I can do is to encourage you to think long and hard before using inline asm in production code.