What is the NEON equivalent of the following SSE instruction?
__m128i a,b,c;
c = _mm_packs_epi32(a, b);
It packs the 8 signed 32-bit integers from a and b into signed 16-bit integers with saturation.
I looked through the NEON instruction reference on the ARM site but could not find an equivalent: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0204j/Bcfjicfj.html
There is no instruction that directly does what you want, but all the building blocks to build one are there:
The saturation/narrow instruction is:
int16x4_t vqmovn_s32 (int32x4_t)
This intrinsic saturates each signed 32-bit integer to the signed 16-bit range and returns the four narrowed integers in a 64-bit vector.
Building your _mm_packs_epi32 from this is easy: narrow a and b separately, then combine the two halves:
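To make the saturation concrete, here is a minimal scalar sketch of what happens to a single lane. The helper name sat_s32_to_s16 is hypothetical (it is not a NEON intrinsic); it only models the clamping behaviour described above.

```c
#include <stdint.h>

/* Scalar model of one lane of vqmovn_s32 (hypothetical helper, for illustration):
   clamp the 32-bit value to the int16_t range, then narrow it. */
static int16_t sat_s32_to_s16(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;   /* too large: saturate to 32767 */
    if (x < INT16_MIN) return INT16_MIN;   /* too small: saturate to -32768 */
    return (int16_t)x;                     /* in range: value is unchanged */
}
```

Values that already fit in 16 bits pass through unchanged; anything outside the range is pinned to the nearest limit instead of wrapping around.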
int32x4_t a,b;
int16x8_t c;
c = vcombine_s16 (vqmovn_s32(a), vqmovn_s32(b));
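The argument order shown should be correct as is: vcombine_s16 places its first argument in the low half of the result, and _mm_packs_epi32 likewise puts the narrowed values of a in the lower four lanes and those of b in the upper four.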
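If you want to check the lane order and saturation on your own target, a small array-based wrapper is one way to do it. This is only a sketch: pack_s32_to_s16 is a hypothetical name, and the scalar branch is an assumed fallback added so the same code also compiles where NEON is not available.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical wrapper: narrows a[0..3] into out[0..3] and b[0..3] into
   out[4..7] with signed saturation, the layout _mm_packs_epi32 produces. */
static void pack_s32_to_s16(const int32_t a[4], const int32_t b[4], int16_t out[8])
{
#if defined(__ARM_NEON)
    int32x4_t va = vld1q_s32(a);
    int32x4_t vb = vld1q_s32(b);
    /* First argument of vcombine_s16 becomes the low half of the result. */
    vst1q_s16(out, vcombine_s16(vqmovn_s32(va), vqmovn_s32(vb)));
#else
    /* Assumed scalar fallback for non-NEON builds: clamp, then narrow. */
    for (int i = 0; i < 4; ++i) {
        int32_t x = a[i];
        int32_t y = b[i];
        out[i]     = (int16_t)(x > INT16_MAX ? INT16_MAX : (x < INT16_MIN ? INT16_MIN : x));
        out[i + 4] = (int16_t)(y > INT16_MAX ? INT16_MAX : (y < INT16_MIN ? INT16_MIN : y));
    }
#endif
}
```

Feeding it a few out-of-range values (for example 70000 and -70000) and comparing against the SSE result on the same inputs is a quick way to confirm that both the clamping and the ordering of a and b match.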