c sse simd intrinsics

Swapping the upper 32 bits with the lower 32 bits in an __m128


I'm working with C intrinsics (SSE/SSE2 only) right now, and I have an __m128 value with 4 floats in it. Is there any way to shift / shuffle / move the uppermost 32 bits to the lowermost 32 bits?

Example: I have {1.0f, 2.0f, 3.0f, 4.0f} in an __m128 and I want to make {4.0f, 2.0f, 3.0f, 1.0f} out of it (the values in between may be erased).


Solution

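  • You can do that with the shufps xmm, xmm, imm8 instruction (the _mm_shuffle_ps intrinsic). Its 8-bit immediate is a compile-time constant made of four 2-bit fields, and each field selects which input element is stored in the corresponding output element.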

    #include <stdio.h>
    #include <xmmintrin.h>
    
    int main(void) {
        float array[4] = {1.0f, 2.0f, 3.0f, 4.0f};
        __m128 data;
        printf("before : %.1f %.1f %.1f %.1f\n", array[0], array[1], array[2], array[3]);
        data = _mm_loadu_ps(array);
        /* 0x27 = 0b00'10'01'11: out[0]=in[3], out[1]=in[1], out[2]=in[2], out[3]=in[0] */
        data = _mm_shuffle_ps(data, data, 0x27);
        _mm_storeu_ps(array, data);
        printf("after  : %.1f %.1f %.1f %.1f\n", array[0], array[1], array[2], array[3]);
        return 0;
    }
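
The 0x27 immediate above can be hard to read at a glance. As a rough sketch of how it decodes, the same constant can be built with the _MM_SHUFFLE macro from xmmintrin.h, and the selection can be modelled in scalar code. The helper names swap_outer_elements and shuffle_model are made up for illustration and are not part of any library; the scalar model assumes both shufps operands are the same register, as in the answer.

```c
#include <xmmintrin.h>  /* SSE: __m128, _mm_shuffle_ps, _MM_SHUFFLE */

/* Hypothetical helper (illustrative name): swaps elements 0 and 3 of v
 * and leaves elements 1 and 2 where they are. */
static __m128 swap_outer_elements(__m128 v)
{
    /* _MM_SHUFFLE(z, y, x, w) expands to (z << 6) | (y << 4) | (x << 2) | w.
     * w selects the source of output element 0, z the source of element 3,
     * so _MM_SHUFFLE(0, 2, 1, 3) is the same value as 0x27. */
    return _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 2, 1, 3));
}

/* Hypothetical scalar model of the selection, valid only when both
 * shufps operands are the same register. With two different operands,
 * output elements 0-1 come from the first and 2-3 from the second. */
static void shuffle_model(const float in[4], unsigned imm8, float out[4])
{
    for (int i = 0; i < 4; ++i) {
        /* Field i of the immediate (bits 2i+1..2i) is the input index. */
        out[i] = in[(imm8 >> (2 * i)) & 3u];
    }
}
```

Calling swap_outer_elements on a register loaded from {1.0f, 2.0f, 3.0f, 4.0f} and storing it back should give the same result as the program above. The macro form only documents the four selectors more explicitly; it should compile to the same shufps instruction.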