Search code examples
calgorithmoptimizationintegerbit-manipulation

what is the fastest way to count the number of matching digits in base 4?


Given two unsigned integers, what is the fastest way to count the number of matching digits in their base 4 representation?

example 1:

A= 13 = (31) in base 4

B= 15 = (33) in base 4

the number of matching digits in base 4 is 1.

example 2:

A= 163 = (2203) in base 4

B= 131 = (2003) in base 4

the number of matching digits in base 4 is 3.

The first step I guess is to calculate the bitwise XOR of the two integers, then we have to count number of 00 pairs ? what is the most efficient way t do that ?

note: assume that A and B have fixed number of digits in base 4, say exactly 16 digits.


Solution

  • Suppose, your ints are 4-byte each. 32 bits.

    The more understandable way:
    Help constant array:

    h[0]=3;
    for (int i=1; i<7; i++){
      h[i]=h[i-1]*4;
    }
    

    Later, for the check, if c is the integer after bitwise XOR :

    int count=0;
    for (int i=0; i<7; i++){
      if(c&h[i]==0)count++;
    }   
    

    Other solution. Obviously, faster, but a bit less understandable:

    int h[4]={1,0,0,0}
    
    int count=0;
    for (int i=0; i<15; i++){
      count+=h[c&3];
      c=c>>2;
    }   
    

    Further qickening:

    count= h[c&3] + h[(c=>>2)&3] + h[(c=>>2)&3] + h[(c=>>2)&3]+ h[(c=>>2)&3]+ h[(c=>>2)&3]+ h[(c=>>2)&3]+ h[(c=>>2)&3]+ h[(c=>>2)&3] + h[(c=>>2)&3]+ h[(c=>>2)&3]+ h[(c=>>2)&3]+ h[(c=>>2)&3]+ h[(c=>>2)&3]+ h[(c>>2)&3];
    

    Even further:

    int h[16]={2,1,1,1, 1,0,0,0, 1,0,0,0, 1,0,0,0};
    count= h[c&15] + h[(c=>>4)&15] + h[(c=>>4)&15] + h[(c=>>4)&15]  + h[(c=>>4)&15] + h[(c=>>4)&15] + h[(c=>>4)&15]+ h[(c>>4)&15];
    

    If you really need use the function so many (10^10) times, count h[256] (you already caught, how), and use:

    count= h[c&255] + h[(c=>>8)&255] + h[(c=>>8)&255] + h[(c>>8)&255] ;
    

    I think, the help array h[256*256] would be also usable yet. Then

    count= h[c&255] + h[(c>>16)&(256*256-1)];
    

    The array of 2^16 ints will be all in the processor cash (third level, though). So, the speed will be really great.