Search code examples
calgorithmmathfixed-point

Fixed point code division understanding


Code for division by 9 in fixed point.

  1. q = 0;              // quotient
  2. y = (x << 3) - x;   // y = x * 7
  3. while(y) {          // until nothing significant
  4.   q += y;           // add (effectively) binary 0.000111
  5.   y >>= 6;          // realign
  6. }
  7. q >>= 6;            // align

Line 2 through 5 in the FIRST execution of while loop is effectively doing
x*.000111 (in decimal representation x*0.1), what it is trying to achieve in subsequent while loops?

Should it not be again multiplying that with 7 and again shifting instead of doing only shifting to take care of recurrence?

Explanation with respect to plain decimal number multiplication as to what is being achieved with only shifting would be nice.

Detailed code explanation here: Divide by 9 without using division or multiplication operator


Solution

  • Lets denote 7/64 by the letter F. 7/64 is represented in binary as 0.000111 and is very close to 1/9. But very close is not enough. We want to use F to get exactly to 1/9. It is done in the following way

    F+ (F/64) + (F/64^2) + (F/64^3) + (F/64^4)+ (F/64^5) + ...
    

    As we add more elements to this sequence the results gets closer to 1/9 Note each element in the sequence is exactly 1/64 from previous element.

    A fast way to divide by 64 is >>6

    So effectively you want to build a loop which sums this sequence. You start from F and in each iteration do F>>6 and add it to the sum. Eventually (after enough iterations) the sum will be exactly 1/9.

    Ok now, you are ready to understand the code. Instead of using F (which is a fraction and cannot be represented in fixed points) the code multiplies F by x. So the sum of the sequence will be X/9 instead of 1/9 Moreover in order to work with fixed points it is better to store 64*X*F and the result would by 64*X/9.

    Later after the summation we can divide by 64 to get the X/9

    The code stores in the variable y the value of F*x*64

    variable q stores the sum of the sequence. In each loop iteration we generate the next element in the sequence by dividing the previous element by 64 (y>>=6)

    Finally after the loop we divide the sum by 64 (q>>=6) and get the result X/9.

    Regarding you question. We should not multiply by 7 each time or else we will get a sum of the sequence of

    F+ (F^2) + (F^3) + (F^4) + (F^5)...
    

    This would yield a result of ~X/(8+1/7) instead of X/9.