The popcount
function returns the number of 1's in an input. 0010 1101
has a popcount
of 4.
Currently, I am using this algorithm to get the popcount
:
private int PopCount(int x)
{
x = x - ((x >> 1) & 0x55555555);
x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
return (((x + (x >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
}
This works fine and the only reason I ask for more is because this operation is run awfully often and I am looking for additional performance gains.
I'm looking for a way to simplify the algorithm based on the fact that my 1's will always be right aligned. That is, the input will be something like 00000 11111
(returns 5) or 00000 11111 11111
(returns 10).
Is there a way to make a more efficient popcount based on this constraint? If the input was 01011 11101 10011
, it would just return 2 because it only cares about the right-most ones. It seems any kind of looping is slower than the existing solution.
Here's a C# implementation that performs "find highest set" (binary logarithm). It may or may not be faster than your current PopCount, it surely is slower than using the real clz
and/or popcnt
CPU instructions:
static int FindMSB( uint input )
{
if (input == 0) return 0;
return (int)(BitConverter.DoubleToInt64Bits(input) >> 52) - 1022;
}
Test: http://rextester.com/AOXD85351
And a slight variation without a conditional branch:
/* precondition: ones are right-justified, e.g. 00000111 or 00111111 */
static int FindMSB( uint input )
{
return (int)(input & (int)(BitConverter.DoubleToInt64Bits(input) >> 52) - 1022);
}