In ARM64 compilers with GCC-like __asm__, how could I make use of multi-vector NEON types like uint8x16x4_t?
    uint8x16x4_t Meow()
    {
        uint8x16x4_t result;
        __asm__(
            "meow %0"
            : "=w"(result));
        return result;
    }
That results in the following assembly output:
    meow v0
Is there a way to get something like this instead?
    meow { v0.16b - v3.16b }
Or even better, refer to the individual parts somehow.
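You'll have to do it manually, but you can do so with the T, U and V operand modifiers: %T0, %U0 and %V0 name the second, third and fourth registers of the group, while plain %0 names the first. Suffixes such as .16b can just be specified literally. The following code: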
    uint8x16x4_t Meow()
    {
        uint8x16x4_t result;
        __asm__(
            "meow { %0.16b, %T0.16b, %U0.16b, %V0.16b }"
            : "=w"(result));
        return result;
    }
gives me:
    Meow:
        meow { v4.16b, v5.16b, v6.16b, v7.16b }
        mov v1.16b, v5.16b
        mov v2.16b, v6.16b
        mov v3.16b, v7.16b
        mov v0.16b, v4.16b
        ret
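
For something you can actually assemble, here is a minimal sketch that swaps the placeholder meow for the real ld4 instruction, which de-interleaves 64 bytes into four consecutive vector registers. The helper name load4 and the plain-C fallback are my own additions for illustration. The inline-asm path assumes GCC targeting AArch64; I have not checked whether other compilers accept the T/U/V modifiers, so they take the fallback here.

```c
#include <stdint.h>

#if defined(__aarch64__) && defined(__GNUC__) && !defined(__clang__)
#include <arm_neon.h>

/* Hypothetical helper: de-interleave 64 bytes into four 16-byte rows. */
static void load4(const uint8_t *src, uint8_t out[4][16])
{
    uint8x16x4_t r;

    /* %0, %T0, %U0, %V0 expand to the four consecutive registers that
       back r. The unused "m" input only tells the compiler that the
       64 bytes at src are read by the asm. */
    __asm__("ld4 { %0.16b, %T0.16b, %U0.16b, %V0.16b }, [%1]"
            : "=w"(r)
            : "r"(src), "m"(*(const uint8_t (*)[64])src));

    for (int k = 0; k < 4; k++)
        vst1q_u8(out[k], r.val[k]);
}
#else
/* Assumed portable fallback with the same result: out[k][i] = src[4*i + k]. */
static void load4(const uint8_t *src, uint8_t out[4][16])
{
    for (int i = 0; i < 16; i++)
        for (int k = 0; k < 4; k++)
            out[k][i] = src[4 * i + k];
}
#endif
```

After the call, r.val[0] through r.val[3] are the "individual parts" on the C side, so you only need the modifiers inside the asm template itself.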