Search code examples
c++bit-manipulationintrinsicsinstructions

Gather/Extract first Bit from Integer Array


Problem

Does an instruction exist that gathers/extracts the first bit of an int[32] and stores it into a int?

  • I know about the intrinsic pext but that is not really what I want.

  • I do have a code for that but I thought maybe there a designated instruction.

  • The ints array is zero besides the first bit. Ergo, no masking needed.

Code

void ints2bits(int &bits, int *ints) {
   bits = (ints[0] << 0) + (ints[1] << 1) + ... + (ints[31] << 31);
}

UPDATE & FEEDBACK:

Just tested harold suggestions. It works very well and I can attain nice speed up.


Solution

  • There is no single instruction that can even read that much data, but groups of 4 (8 with AVX2) can be processed quickly with the use of _mm_movemask_ps. Ignore the fact that it claims to be a floating point instruction, it just gathers and appends the 4 top bits.

    Of course moving the bottom bit into the top is easy with _mm_slli_epi32.

    So putting it together (not tested)

    int res = 0;
    for (int i = 0; i < 32; i += 4) {
        __m128i x = _mm_load_si128((__m128i*)&ints[i]); // I assume it's aligned
        x = _mm_slli_epi32(x, 31);
        int bits = _mm_movemask_ps(_mm_castsi128_ps(x));
        res += bits << i;
    }
    

    The extension to AVX2 is pretty obvious.

    An other possible approach is shifting each lane by a variable amount (pre-AVX2 this requires multiplication) and then summing, first vertically of course, saving the horizontal sum for last. This is likely slower, and certainly more awkward.