c++ x86 sse simd intrinsics

Setting last or first n bits in SSE register


How can I create a __m128i with the n most significant bits set (counting across the entire vector)? I need this to mask the portions of a buffer that are relevant for a computation. If possible, the solution should have no branches, but this seems hard to achieve.

How can I do this?


Solution

  • You can use one of the methods from this question to generate a mask with the most significant k bytes set to all ones. You then just need to fix up the remaining bits when n is not a multiple of 8.

    I suggest trying something like this:

    - init vector A = all (8 bit) elements set to the residual mask, i.e. the n % 8 most significant bits of each byte
    - init vector B = mask of n / 8 bytes using one of the above-mentioned methods
    - init vector C = mask of (n + 7) / 8 bytes using one of the above-mentioned methods
    - result = (A | B) & C (the parentheses matter: in C and C++, & binds tighter than |)
    

    So for example if n = 36 (bytes shown most significant first):

    A = f0 f0 f0 f0 f0 f0 f0 f0 f0 f0 f0 f0 f0 f0 f0 f0
    B = ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00
    C = ff ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00
    ==> ff ff ff ff f0 00 00 00 00 00 00 00 00 00 00 00
    

    This would be branchless, as required, but it's probably on the order of 10 instructions. There may be a more efficient method, but I would need to give this some more thought.
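
The steps above can be sketched in C++ with SSE2 intrinsics as follows. This is only a minimal sketch under a few assumptions: the byte mask comes from an unaligned load out of a 32-byte sliding window, which is one possible technique and not necessarily one of those from the linked question, and the names high_bytes_mask, msb_mask and to_bytes are hypothetical helpers made up for illustration. It also assumes 0 <= n <= 128; the assert is a debug-only check and compiles away with NDEBUG.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: mask with the k most significant bytes set, 0 <= k <= 16.
// Assumed technique: 16 zero bytes followed by 16 0xff bytes; an unaligned load
// at offset k yields 16 - k zero bytes in the low lanes and k 0xff bytes on top.
inline __m128i high_bytes_mask(unsigned k) {
    static const std::uint8_t window[32] = {
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
        0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
        0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
    return _mm_loadu_si128(reinterpret_cast<const __m128i*>(window + k));
}

// Hypothetical helper: mask with the n most significant bits of the vector set.
inline __m128i msb_mask(unsigned n) {
    assert(n <= 128);  // debug-only precondition check
    // A: every byte holds the residual mask (top n % 8 bits set; 0x00 if n % 8 == 0).
    const std::uint8_t residual = static_cast<std::uint8_t>(0xff00u >> (n % 8));
    const __m128i a = _mm_set1_epi8(static_cast<char>(residual));
    // B: the n / 8 fully covered bytes.
    const __m128i b = high_bytes_mask(n / 8);
    // C: the (n + 7) / 8 bytes that are fully or partially covered.
    const __m128i c = high_bytes_mask((n + 7) / 8);
    // result = (A | B) & C
    return _mm_and_si128(_mm_or_si128(a, b), c);
}

// Hypothetical helper for inspection: lane 0 is the least significant byte.
inline std::array<std::uint8_t, 16> to_bytes(__m128i v) {
    std::array<std::uint8_t, 16> out;
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out.data()), v);
    return out;
}
```

With n = 36, to_bytes(msb_mask(36)) has lanes 15 down to 12 equal to 0xff, lane 11 equal to 0xf0 and all lower lanes zero, which matches the example above once you account for lane 15 being the most significant byte. The table load costs a memory access per mask, so whether this beats a purely arithmetic approach would need measuring.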