c x86 simd avx avx2

Best way to mask a single bit in AVX2?


For example, given an input ymm vector x and a bit index i, I want an output vector in which only the i-th bit of x is kept and every other bit is zeroed.

With AVX512 k registers I could write the following, but AVX2 and earlier extensions don't have k registers. What is the best way to do it there?

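#include <immintrin.h>

// Keep only bit i (0..511) of x; every other bit is zeroed.
__m512i m512i_maskBit(__m512i x, unsigned i) {
    __mmask8 m = _cvtu32_mask8(1u << (i / 64));                 // select the 64-bit lane that holds bit i
    __m512i vm = _mm512_maskz_set1_epi64(m, 1ull << (i % 64));  // bit within that lane, other lanes zero
    return _mm512_and_si512(x, vm);
}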

Solution

  • Here is an approach using variable shifts (just creating the mask):

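    #include <immintrin.h>

    // Build a 256-bit mask with only bit i set (all zero if i >= 256).
    __m256i create_mask(unsigned i) {
        __m256i ii = _mm256_set1_epi32((int)i);
        // Per-lane shift count: i minus the index of the lane's lowest bit.
        ii = _mm256_sub_epi32(ii, _mm256_setr_epi32(0, 32, 64, 96, 128, 160, 192, 224));
        // Lanes whose count falls outside 0..31 become 0.
        return _mm256_sllv_epi32(_mm256_set1_epi32(1), ii);
    }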
    

    _mm256_sllv_epi32 (vpsllvd) was introduced with AVX2 and shifts each 32-bit element by its own, variable number of bits. If the shift count, interpreted as unsigned, is greater than 31 (which covers every negative signed value as well), the corresponding result element is 0.
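
    To see why exactly one lane ends up non-zero, here is a scalar model of the per-lane behaviour described above. It is only an illustrative sketch in plain C: sllv_lane and create_mask_ref are hypothetical names, not intrinsics, and the unsigned wrap-around stands in for the 32-bit subtraction done by _mm256_sub_epi32.

    ```c
    #include <stdint.h>

    /* Scalar model of one vpsllvd lane: a count above 31 yields 0. */
    static uint32_t sllv_lane(uint32_t value, uint32_t count) {
        return count > 31 ? 0u : value << count;
    }

    /* Hypothetical reference: writes the eight 32-bit mask lanes for bit i. */
    static void create_mask_ref(unsigned i, uint32_t out[8]) {
        for (unsigned lane = 0; lane < 8; ++lane) {
            /* i - 32*lane wraps to a huge value for lanes above the target,
               and is >= 32 for lanes below it, so both kinds give 0. */
            uint32_t count = (uint32_t)i - 32u * lane;
            out[lane] = sllv_lane(1u, count);
        }
    }
    ```

    For i = 37, lane 1 gets the count 5 and holds 1 << 5; lane 0 sees 37 and lane 2 sees a wrapped-around count, so both are 0.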

    Godbolt link with small test code: https://godbolt.org/z/a5xfqTcGs
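
To get the AVX2 equivalent of the AVX512 function in the question, the mask can simply be ANDed with the input. The following is a minimal sketch under a few assumptions, not part of the answer above: mask_bit and the array wrapper mask_bit_u32x8 are hypothetical names, and the TARGET_AVX2 macro assumes GCC or Clang, where it lets the file build without -mavx2 (other compilers need their own AVX2 switch). The code must still run on a CPU that supports AVX2.

```c
#include <immintrin.h>
#include <stdint.h>

/* Assumption: GCC/Clang per-function target attribute. */
#if defined(__GNUC__)
#define TARGET_AVX2 __attribute__((target("avx2")))
#else
#define TARGET_AVX2
#endif

/* Mask with only bit i set, built with the variable-shift approach. */
TARGET_AVX2 static inline __m256i create_mask(unsigned i) {
    __m256i ii = _mm256_set1_epi32((int)i);
    ii = _mm256_sub_epi32(ii, _mm256_setr_epi32(0, 32, 64, 96, 128, 160, 192, 224));
    return _mm256_sllv_epi32(_mm256_set1_epi32(1), ii);
}

/* Hypothetical helper: keep only bit i of x, zero everything else. */
TARGET_AVX2 static inline __m256i mask_bit(__m256i x, unsigned i) {
    return _mm256_and_si256(x, create_mask(i));
}

/* Hypothetical array wrapper, convenient for calling from non-AVX code. */
TARGET_AVX2 static void mask_bit_u32x8(const uint32_t in[8], unsigned i,
                                       uint32_t out[8]) {
    __m256i x = _mm256_loadu_si256((const __m256i *)in);
    _mm256_storeu_si256((__m256i *)out, mask_bit(x, i));
}
```

Bit i here means bit i of the whole 256-bit value in little-endian element order, which matches the i / 64 and i % 64 numbering of the AVX512 version. An index of 256 or more produces an all-zero result.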