What is the best way to copy a 32bits array into 16bits arrays?
I know that "memcpy" uses hardware instruction.But is there a standard function to copy arrays with "changing size" in each element?
I use gcc for armv7 (cortex A8).
uint32_t tab32[500];
uint16_t tab16[500];
for(int i=0;i<500;i++)
tab16[i]=tab32[i];
On ARM cortex A8 with Neon instruction set, the fastest methods use interleaved read/write instructions:
vld2.16 {d0,d1}, [r0]!
vst1.16 {d0}, [r1]!
or saturating instructions to convert a vector of 32-bit integers to a vector of 16-bit integers.
Both of these methods are available in c using gcc intrinsic. It's also possible that gcc can autovectorize a carefully written c-code to use nothing but these particular instructions. This would basically require that there's a one to one correspondence with all the side effects of these instructions and the c code.