Search code examples
c#algorithmbit-manipulationbit

Create new mask


I am looking for an optimal method that will, based on a list of ulong, look through pairs of 4 bits and if any of these bits is set to 1, it is supposed to set the first bit in the new ulong value to 1 otherwise to 0.

for example

List qwords = new List() {0x1, 0x1, 0x1} ulong myMask = 0;

1 Qword 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001

2 Qword 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001

3 Qword 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001 Now checking all 4 bit pairs from each qword in turn we should get the following value.

New Value: 000100000000000000010000000000000001

myMask = 4295032833 (0x100010001)

enter image description here

The example given with the list is just to make it easier to understand how the code logic works. Can anyone help me how to solve this optimally?

enter image description here


Solution

  • As I understand this problem, we compute for every nibble whether any of its bits are set, and then we "compress" that to get a mask indicating which nibble have at least one set bit.

    The first step is easy:

    for (int i = 0; i < qwords.Length; i++)
    {
        ulong bits = qwords[i];
        bits |= bits >> 1;
        bits |= bits >> 2;
        qwords[i] = bits;
    }
    

    I'm modifying the original list here but you are of course welcome to put that intermediate result somewhere else.

    Then if we are allowed to use System.Runtime.Intrinsics.X86 and if BMI2 is supported, we can do the "compression" like this (for a list of 4 qwords):

    ulong m = 0x1111111111111111;
    ulong myMask = Bmi2.X64.ParallelBitExtract(qwords[0], m) |
        (Bmi2.X64.ParallelBitExtract(qwords[1], m) << 16) |
        (Bmi2.X64.ParallelBitExtract(qwords[2], m) << 32) |
        (Bmi2.X64.ParallelBitExtract(qwords[3], m) << 48);
    

    Otherwise, using a method compress (further below) we can do this:

    ulong myMask = compress(qwords[0]) |
        (compress(qwords[1]) << 16) |
        (compress(qwords[2]) << 32) |
        (compress(qwords[3]) << 48);
    

    We could compress like this:

    ulong compress(ulong x)
    {
        x = (x & 0x0101010101010101) | ((x & 0x1010101010101010) >> 3);
        x = (x & 0x0003000300030003) | ((x & 0x0300030003000300) >> 6);
        x = (x & 0x0000000F0000000F) | ((x & 0x000F0000000F0000) >> 12);
        x = (x & 0x00000000000000FF) | ((x & 0x000000FF00000000) >> 24);
        return x;
    }
    

    For lists of up to 4 qwords:

    ulong myMask = 0;
    for (int i = 0; i < qwords.Length; i++)
    {
        ulong bits = qwords[i];
        bits |= bits >> 1;
        bits |= bits >> 2;
        myMask |= compress(bits) << (i * 16);
    }