I load the data like this:
ld1 {v8.8h, v9.8h, v10.8h, v11.8h}, [%8], #64
But when I use the loaded data in a calculation, the assembler rejects it:
smlal v16.4s, v8.2d[0], v0.h[0]
The error is:
/tmp/cc2h1F9Y.s:523: Error: operand 2 must be a SIMD vector register -- `smlal v16.4s,v8.2d[0],v0.h[0]'
So how do I access one 64-bit half of Vn.8h in ARMv8, the way a D register works in ARMv7?
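This is a multiply-accumulate of a vector by a scalar (a single lane of another register).
In AArch64 you do not index into the register to pick a 64-bit half. Instead, the arrangement specifier and the instruction variant select it: smlal reads the low half (written as .4h) and smlal2 reads the high half (written as .8h).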
smlal v16.4s, v8.4h, v0.h[0]
This multiplies the low 64 bits of v8 (16-bit lanes 0 to 3) by v0.h[0] and accumulates the widened products into v16.4s.
smlal2 v16.4s, v8.8h, v0.h[0]
And this does the same for the high 64 bits of v8 (16-bit lanes 4 to 7).
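To make the lane selection concrete, here is a minimal scalar sketch in portable C of what the two instructions compute. The function names smlal_lane_model and smlal2_lane_model are made up for illustration and are not part of any library; registers are modeled as plain arrays, and the wrap-around on accumulator overflow is written out explicitly as an assumption about how you want the model to behave in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of: smlal vd.4s, vn.4h, vm.h[lane]
 * Reads the LOW four 16-bit lanes of n (n[0..3]), multiplies each by the
 * single lane m[lane], and adds the 32-bit products to acc. */
static void smlal_lane_model(int32_t acc[4], const int16_t n[8],
                             const int16_t m[8], int lane)
{
    assert(lane >= 0 && lane < 8);
    for (int i = 0; i < 4; i++) {
        /* int16 * int16 always fits in int32 */
        int32_t prod = (int32_t)n[i] * (int32_t)m[lane];
        /* add in unsigned arithmetic so overflow wraps instead of being UB */
        acc[i] = (int32_t)((uint32_t)acc[i] + (uint32_t)prod);
    }
}

/* Hypothetical model of: smlal2 vd.4s, vn.8h, vm.h[lane]
 * Same operation, but reads the HIGH four 16-bit lanes of n (n[4..7]). */
static void smlal2_lane_model(int32_t acc[4], const int16_t n[8],
                              const int16_t m[8], int lane)
{
    assert(lane >= 0 && lane < 8);
    for (int i = 0; i < 4; i++) {
        int32_t prod = (int32_t)n[i + 4] * (int32_t)m[lane];
        acc[i] = (int32_t)((uint32_t)acc[i] + (uint32_t)prod);
    }
}
```

Calling smlal_lane_model and then smlal2_lane_model on the same n, with two different accumulators, covers all eight 16-bit lanes of the source register, which is the usual pattern for the smlal/smlal2 pair.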
For those who are confused by this...