c++ simd intrinsics avx avx2

AVX2: Is there a way to implement _mm256_mul_epi8 function for a constant power of 2?


I would like to implement the following operation on 8 bit elements:

_a = _b * 8 + _c

with vectors. For the addition there is obviously _mm256_add_epi8, but I was not able to find a _mm256_mul_epi8 or anything else that multiplies 8-bit elements. I also tried to find a function that left-shifts 8-bit elements by 3, but had no luck.

Thanks for helping!


Solution

  • You can do this with additions only, because doubling a value three times multiplies it by 8 (with the usual 8-bit wraparound):

    __m256i _b2 = _mm256_add_epi8(_b,_b);
    __m256i _b4 = _mm256_add_epi8(_b2,_b2);
    __m256i _b8 = _mm256_add_epi8(_b4,_b4);
    __m256i _a = _mm256_add_epi8(_b8,_c);
    
    

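    You can also do this with one of the wider shifts (AVX2 has no 8-bit shift), provided you first clear the top 3 bits of each byte. Those are the bits a true 8-bit shift would discard, and without the mask they would spill into the neighbouring byte: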

    // keep only the low 5 bits of each byte; not needed if every (unsigned) _b value is below 32
    __m256i _b_low = _mm256_and_si256(_b,_mm256_set1_epi8(0x1F));
    
    __m256i _b8 = _mm256_slli_epi32(_b_low,3);
    __m256i _a = _mm256_add_epi8(_b8,_c);
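
To check that the two variants above agree, here is a minimal sketch that runs both over every possible byte value of b and compares the results with a plain scalar loop. The function names (mul8_add_via_adds, mul8_add_via_shift, run_both, variants_match_scalar) are made up for this illustration. The `target("avx2")` attribute and `__builtin_cpu_supports` are assumptions that you are building with GCC or Clang on x86; with other compilers you would enable AVX2 through compiler flags instead.

```cpp
#include <immintrin.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Variant 1: a = b * 8 + c (mod 256) using three byte-wise doublings.
__attribute__((target("avx2")))
static __m256i mul8_add_via_adds(__m256i b, __m256i c) {
    __m256i b2 = _mm256_add_epi8(b, b);
    __m256i b4 = _mm256_add_epi8(b2, b2);
    __m256i b8 = _mm256_add_epi8(b4, b4);
    return _mm256_add_epi8(b8, c);
}

// Variant 2: same result with a 32-bit shift. The top 3 bits of each byte
// are cleared first so they cannot leak into the neighbouring byte.
__attribute__((target("avx2")))
static __m256i mul8_add_via_shift(__m256i b, __m256i c) {
    __m256i b_low = _mm256_and_si256(b, _mm256_set1_epi8(0x1F));
    __m256i b8 = _mm256_slli_epi32(b_low, 3);
    return _mm256_add_epi8(b8, c);
}

// Hypothetical driver: applies both variants to n bytes (n a multiple of 32).
__attribute__((target("avx2")))
static void run_both(const uint8_t* b, const uint8_t* c,
                     uint8_t* out_adds, uint8_t* out_shift, std::size_t n) {
    assert(n % 32 == 0);
    for (std::size_t i = 0; i < n; i += 32) {
        __m256i vb = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(b + i));
        __m256i vc = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(c + i));
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(out_adds + i),
                            mul8_add_via_adds(vb, vc));
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(out_shift + i),
                            mul8_add_via_shift(vb, vc));
    }
}

// Returns true if both variants match the scalar reference for all 256
// values of b. If the CPU lacks AVX2 there is nothing to run, so the
// check is skipped and reported as passing.
static bool variants_match_scalar() {
    if (!__builtin_cpu_supports("avx2")) return true;

    uint8_t b[256], c[256], out_adds[256], out_shift[256];
    for (int i = 0; i < 256; ++i) {
        b[i] = static_cast<uint8_t>(i);
        c[i] = static_cast<uint8_t>(i * 7 + 3);  // arbitrary test pattern
    }
    run_both(b, c, out_adds, out_shift, 256);

    for (int i = 0; i < 256; ++i) {
        uint8_t expected = static_cast<uint8_t>(b[i] * 8 + c[i]);
        if (out_adds[i] != expected || out_shift[i] != expected) return false;
    }
    return true;
}
```

Both variants wrap modulo 256, exactly like the scalar expression cast back to uint8_t. The shift variant needs three instructions (and, shift, add) against four adds, at the cost of keeping the mask constant in a register.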