I need to load an 8-bit array and then convert every element to a 32-bit integer using armv8a neon inline asm code. I have done it on armv7 but have no idea how to do it on v8a...
The code I used in v7 is
"pld [%1, #128] \n"
"vld1.u8 {d0,d1}, [%1]! \n"
"vmovl.u8 q8, d0 \n"
"vmovl.u8 q9, d1 \n"
"vmovl.u16 q0, d16 \n"
"vmovl.u16 q1, d17 \n"
"vmovl.u16 q2, d18 \n"
"vmovl.u16 q3, d19 \n"
How can I finish this by using armv8a neon code? Or how can I convert the code above to armv8a? PS: In my case, I only need inline asm but not intrinsics...
Thanks for the help.
For unsigned elements, USHLL and USHLL2 with a shift amount of 0 will do the job.
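ld1 {v0.16b}, [%1], #16      // load 16 bytes, post-increment the pointer by 16
USHLL v16.8h, v0.8b, #0      // bytes 0-7   -> 8 x u16
USHLL2 v17.8h, v0.16b, #0    // bytes 8-15  -> 8 x u16
USHLL v0.4s, v16.4h, #0      // elements 0-3   -> 4 x u32
USHLL2 v1.4s, v16.8h, #0     // elements 4-7   -> 4 x u32
USHLL v2.4s, v17.4h, #0      // elements 8-11  -> 4 x u32
USHLL2 v3.4s, v17.8h, #0     // elements 12-15 -> 4 x u32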
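To show how the sequence above might sit inside a complete inline asm statement, here is a minimal sketch. The function name `widen_u8_to_u32_x16`, the named operands and the final `st1` store are my own additions for illustration, not part of the original snippet, and the scalar loop is only there so the file still builds on non-AArch64 targets. It assumes a little-endian target and GCC/Clang extended asm syntax.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: widen 16 unsigned bytes to 16 uint32_t values. */
static void widen_u8_to_u32_x16(const uint8_t *src, uint32_t *dst)
{
#if defined(__aarch64__)
    __asm__ volatile(
        "ld1    {v0.16b}, [%[src]]        \n" /* load 16 bytes            */
        "ushll  v16.8h, v0.8b,  #0        \n" /* low 8 bytes   -> u16     */
        "ushll2 v17.8h, v0.16b, #0        \n" /* high 8 bytes  -> u16     */
        "ushll  v0.4s,  v16.4h, #0        \n" /* elements 0-3   -> u32    */
        "ushll2 v1.4s,  v16.8h, #0        \n" /* elements 4-7   -> u32    */
        "ushll  v2.4s,  v17.4h, #0        \n" /* elements 8-11  -> u32    */
        "ushll2 v3.4s,  v17.8h, #0        \n" /* elements 12-15 -> u32    */
        "st1    {v0.4s, v1.4s, v2.4s, v3.4s}, [%[dst]] \n" /* store 64 bytes */
        :
        : [src] "r"(src), [dst] "r"(dst)
        : "v0", "v1", "v2", "v3", "v16", "v17", "memory");
#else
    /* Portable fallback with the same result, for non-AArch64 builds. */
    for (size_t i = 0; i < 16; ++i)
        dst[i] = src[i];
#endif
}
```

The pointers are plain inputs here because the sketch does not post-increment them; if you keep the `[%1], #16` form from the snippet above, the pointer needs to be a read-write operand (`"+r"`) instead.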
For signed elements, guess what: use SSHLL and SSHLL2 instead.
Similarly, there is no direct equivalent to VMOVN on aarch64 either.
--EDIT
On the other hand, there are XTN/XTN2 instructions that work exactly like VMOVN.
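For the narrowing direction, a sketch along the same lines could look like this. It is an illustration only: the function name `narrow_u32_to_u8_x16` and the register choices are hypothetical, and the scalar loop is a fallback for non-AArch64 builds. XTN writes the low half of the destination and XTN2 the high half; both simply truncate each element to its low bits, with no saturation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: truncate 16 uint32_t values to 16 bytes. */
static void narrow_u32_to_u8_x16(const uint32_t *src, uint8_t *dst)
{
#if defined(__aarch64__)
    __asm__ volatile(
        "ld1  {v0.4s, v1.4s, v2.4s, v3.4s}, [%[src]] \n" /* load 16 x u32 */
        "xtn  v16.4h, v0.4s    \n" /* elements 0-3   -> low half of v16   */
        "xtn2 v16.8h, v1.4s    \n" /* elements 4-7   -> high half of v16  */
        "xtn  v17.4h, v2.4s    \n" /* elements 8-11  -> low half of v17   */
        "xtn2 v17.8h, v3.4s    \n" /* elements 12-15 -> high half of v17  */
        "xtn  v0.8b,  v16.8h   \n" /* elements 0-7   -> low 8 bytes       */
        "xtn2 v0.16b, v17.8h   \n" /* elements 8-15  -> high 8 bytes      */
        "st1  {v0.16b}, [%[dst]] \n" /* store 16 bytes                    */
        :
        : [src] "r"(src), [dst] "r"(dst)
        : "v0", "v1", "v2", "v3", "v16", "v17", "memory");
#else
    /* Portable fallback: keep only the low 8 bits of each element. */
    for (size_t i = 0; i < 16; ++i)
        dst[i] = (uint8_t)src[i];
#endif
}
```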