arm, sse, simd, neon

Finding a NEON instruction corresponding to an SSE instruction


I want to know the NEON equivalent (a single instruction or a short code sequence) of this SSE instruction:

__m128i a,b,c;
c = _mm_packs_epi32(a, b);

It packs the eight signed 32-bit integers from a and b into signed 16-bit integers, with saturation.

I looked for an equivalent in the ARM documentation but could not find one: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0204j/Bcfjicfj.html


Solution

  • No single instruction does exactly what you want, but all the building blocks are there:

    The saturating narrow intrinsic is:

    int16x4_t vqmovn_s32 (int32x4_t)  
    

    This intrinsic narrows signed 32-bit integers to signed 16-bit integers with saturation, returning the four narrowed values in a 64-bit vector.

    Building your _mm_packs_epi32 from this is easy: narrow a and b separately, then combine the two results:

      #include <arm_neon.h>

      int32x4_t a, b;
      int16x8_t c;

      /* Narrow a and b with saturation, then join the two 64-bit halves. */
      c = vcombine_s16(vqmovn_s32(a), vqmovn_s32(b));
    

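    vcombine_s16 places its first argument in the low 64 bits, and _mm_packs_epi32 places the values from a in the low four lanes, so the order above should already match. Swap the vcombine_s16 arguments only if you need the opposite layout.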
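
One way to check the saturation and the lane order is to compare the NEON version against a plain scalar model. The following is a minimal sketch in C, not a definitive implementation. The helper names saturate_s32_to_s16, packs_epi32_ref and packs_epi32_neon are made up for this example, and it assumes the compiler defines __ARM_NEON (or __ARM_NEON__) when NEON is available; on other targets the NEON wrapper simply falls back to the scalar model.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Saturate one signed 32-bit value to the signed 16-bit range. */
static int16_t saturate_s32_to_s16(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Scalar model of _mm_packs_epi32(a, b):
   out[0..3] come from a, out[4..7] come from b.
   (Hypothetical helper name, not part of any library.) */
static void packs_epi32_ref(const int32_t a[4], const int32_t b[4],
                            int16_t out[8])
{
    assert(a && b && out);
    for (int i = 0; i < 4; ++i) {
        out[i]     = saturate_s32_to_s16(a[i]);
        out[i + 4] = saturate_s32_to_s16(b[i]);
    }
}

/* Same interface, using the intrinsics from the answer when NEON is
   available; otherwise it defers to the scalar model. */
static void packs_epi32_neon(const int32_t a[4], const int32_t b[4],
                             int16_t out[8])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    int32x4_t va = vld1q_s32(a);
    int32x4_t vb = vld1q_s32(b);
    /* a is narrowed into the low half, b into the high half. */
    vst1q_s16(out, vcombine_s16(vqmovn_s32(va), vqmovn_s32(vb)));
#else
    packs_epi32_ref(a, b, out);
#endif
}
```

Calling both functions with the same inputs, for example a = {1, -2, 40000, -40000}, should give identical outputs, with 40000 clamped to 32767 and -40000 clamped to -32768 in the low four lanes.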