Hint for lookup table set bit count algorithm


I am looking at solutions to the set bit count problem (given a binary number, how to efficiently count how many bits are set).

I have found some methods here: http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetNaive.

What about the lookup table method? I don't understand what properties of the binary representation make it work.

static const unsigned char BitsSetTable256[256] = 
{
#   define B2(n) n,     n+1,     n+1,     n+2
#   define B4(n) B2(n), B2(n+1), B2(n+1), B2(n+2)
#   define B6(n) B4(n), B4(n+1), B4(n+1), B4(n+2)
   B6(0), B6(1), B6(1), B6(2)
};

unsigned int v; // count the number of bits set in 32-bit value v
unsigned int c; // c is the total bits set in v

// Option 1:
c = BitsSetTable256[v & 0xff] + 
   BitsSetTable256[(v >> 8) & 0xff] + 
   BitsSetTable256[(v >> 16) & 0xff] + 
   BitsSetTable256[v >> 24]; 

// Option 2:
unsigned char * p = (unsigned char *) &v;
c = BitsSetTable256[p[0]] + 
    BitsSetTable256[p[1]] + 
    BitsSetTable256[p[2]] + 
    BitsSetTable256[p[3]];


// To initially generate the table algorithmically:
BitsSetTable256[0] = 0;
for (int i = 0; i < 256; i++)
{
   BitsSetTable256[i] = (i & 1) + BitsSetTable256[i / 2];
}

In particular, I don't understand the BitsSetTable256 definition. Why define these quantities B2, B4, B6? It seems to me that they are not used afterwards.

Could you also point me to further documentation on binary representation?

Thanks!


Solution

  • The macro definitions build the table by pattern. They are nested: B6 is defined in terms of B4, and B4 in terms of B2. B6(0) will expand into:

    B4(0), B4(1), B4(1), B4(2)
    

    Each B4 in turn expands into four B2s; for instance, B4(0) becomes B2(0), B2(1), B2(1), B2(2), and B2(0) finally yields the literal values:

    0, 1, 1, 2
    

    The first few numbers of the sequence will be:

    // 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,10,11
       0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3
    

    As you can see, these are the number of bits set for each index in the table.
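
    If you want to convince yourself of the expansion, here is a minimal check (not part of the original answer; it assumes a C compiler and uses a hypothetical helper called naive_count) that compares every macro-generated entry against a bit-by-bit count:

    #include <assert.h>
    #include <stdio.h>

    /* Same macro construction as in the question. */
    static const unsigned char BitsSetTable256[256] =
    {
    #   define B2(n) n,     n+1,     n+1,     n+2
    #   define B4(n) B2(n), B2(n+1), B2(n+1), B2(n+2)
    #   define B6(n) B4(n), B4(n+1), B4(n+1), B4(n+2)
        B6(0), B6(1), B6(1), B6(2)
    };

    /* Naive reference: count the bits of i one at a time. */
    static unsigned naive_count(unsigned i)
    {
        unsigned c = 0;
        for (; i; i >>= 1)
            c += i & 1;
        return c;
    }

    int main(void)
    {
        for (unsigned i = 0; i < 256; i++)
            assert(BitsSetTable256[i] == naive_count(i));
        puts("table matches a naive bit count for all 256 entries");
        return 0;
    }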

    The rest of the algorithm breaks the 32-bit value into four 8-bit chunks and sums the number of bits set in each chunk; that is what the following lines do (you use either Option 1 or Option 2, whichever you prefer, not both):

    // Option 1:
    c = BitsSetTable256[v & 0xff] + 
        BitsSetTable256[(v >> 8) & 0xff] + 
        BitsSetTable256[(v >> 16) & 0xff] + 
        BitsSetTable256[v >> 24]; 
    
    // Option 2:
    unsigned char * p = (unsigned char *) &v;
    c = BitsSetTable256[p[0]] + 
        BitsSetTable256[p[1]] + 
        BitsSetTable256[p[2]] + 
        BitsSetTable256[p[3]];
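
    As a concrete example (a sketch, not from the original answer; count_bits_lut is a hypothetical wrapper name), v = 0x12345678 splits into the bytes 0x78, 0x56, 0x34 and 0x12, whose table entries are 4, 4, 3 and 2, so the total is 13:

    #include <stdio.h>

    /* Table from the question, repeated here so this example compiles on its own. */
    static const unsigned char BitsSetTable256[256] =
    {
    #   define B2(n) n,     n+1,     n+1,     n+2
    #   define B4(n) B2(n), B2(n+1), B2(n+1), B2(n+2)
    #   define B6(n) B4(n), B4(n+1), B4(n+1), B4(n+2)
        B6(0), B6(1), B6(1), B6(2)
    };

    /* Option 1 wrapped in a function: look up one byte of v at a time. */
    static unsigned count_bits_lut(unsigned int v)
    {
        return BitsSetTable256[v & 0xff] +
               BitsSetTable256[(v >> 8) & 0xff] +
               BitsSetTable256[(v >> 16) & 0xff] +
               BitsSetTable256[v >> 24];
    }

    int main(void)
    {
        /* 0x78 -> 4 bits, 0x56 -> 4 bits, 0x34 -> 3 bits, 0x12 -> 2 bits */
        printf("%u\n", count_bits_lut(0x12345678u));   /* prints 13 */
        return 0;
    }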
    

    The code at the bottom:

    // To initially generate the table algorithmically:
    BitsSetTable256[0] = 0;
    for (int i = 0; i < 256; i++)
    {
       BitsSetTable256[i] = (i & 1) + BitsSetTable256[i / 2];
    }
    

    is a different way of generating BitsSetTable256: it builds the table at runtime instead of at compile time (which is what the macro definitions do). It works because the number of bits set in i equals (i & 1), the lowest bit, plus the number of bits set in i / 2 (i shifted right by one bit), and the loop fills entries in increasing order, so BitsSetTable256[i / 2] is already computed when it is needed.
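
    For example, here is a worked trace (not in the original answer) for i = 13, binary 1101:

    BitsSetTable256[13] = (13 & 1) + BitsSetTable256[6]   // = 1 + BitsSetTable256[6]
    BitsSetTable256[6]  = (6 & 1)  + BitsSetTable256[3]   // = 0 + BitsSetTable256[3]
    BitsSetTable256[3]  = (3 & 1)  + BitsSetTable256[1]   // = 1 + BitsSetTable256[1]
    BitsSetTable256[1]  = (1 & 1)  + BitsSetTable256[0]   // = 1 + 0
    // => BitsSetTable256[13] = 1 + 0 + 1 + 1 = 3, and 13 = 1101b indeed has 3 bits set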

    P.S. If you're targeting recent enough (SSE4.2-capable) x86 hardware, you can use the POPCNT instruction.
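
    For instance, a minimal sketch (not part of the original answer, assuming GCC or Clang): the portable builtin __builtin_popcount compiles down to a single POPCNT when built with -mpopcnt (or -msse4.2), and the SSE4.2 intrinsic _mm_popcnt_u32 requests the instruction directly:

    #include <stdio.h>
    #include <nmmintrin.h>   /* _mm_popcnt_u32; compile with -msse4.2 or -mpopcnt */

    int main(void)
    {
        unsigned int v = 0x12345678u;

        /* GCC/Clang builtin: falls back to a software popcount on older targets. */
        printf("%d\n", __builtin_popcount(v));   /* prints 13 */

        /* SSE4.2 intrinsic for the POPCNT instruction. */
        printf("%d\n", _mm_popcnt_u32(v));       /* prints 13 */

        return 0;
    }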