performance assembly bit-manipulation math approximation

Division by a constant using shifts and adds/subtracts

Hi all I'm trying to divide by an unsigned constant using only shifts and adds/subtracts - I have no problem with this if it were multiplication, but I'm a bit stumped by the division.

For example, lets say the constant divisor is 192 and lets say the dividend is 8000

"full result" y = 8000/192 = 41 (assuming I'm not keeping fractional bits)

y = 8000 >> 8 ... 31 y = 8000 >> 7 ... 62

But how do I get a more accurate solution?

Many thanks!

Solution

There is almost certainly a more elegant way to do it, but here's enough to get you started.

Usually division in this way is done by multiplying by the reciprocal, i.e., first multiplying and then right-shifting.

(Note: remember that multplication can be accomplished by shifts and adds (e.g. n * 3 = (n*2) + (n*1) = (n << 1) + (n) ) but I'm just going to use multiplication here. Your question said "shifts & adds" and I'm justifying my shorthand use of multiplication)

In the examples below, I'm trying to explain the concepts with an example. In your specific case, you'll want to consider issues such as

sign (I'm using unsigned ints below)
overflow (below I'm using 32-bit unsigned longs to hold intermediate values but if you're on a smaller uC beware, adjust accordingly
rounding (e.g. should 9/5 return 1 or 2? In C, it's 1, but maybe you want 2 because it's closer to the correct answer?)
To the extent that you can (available bits), do your all of your multiplies before your divides (minimizing truncation errors). Again, be aware of overflow.

As I said, read the below to understand the concepts, then tailor to your needs.

Dividing by 192 is the same as multiplying by 1/192, which is the same as dividing by (64 * 3). There is not an exact (finite) binary representation of 1/3, so we're approximating it with 0x5555/(1 << 16).

To divide by 192, we divide by 64 and then divide by 3. To divide by 3, we multiply by 0x5555 and shift right by 16 (or multiply by 0x55 and >> 8, or...)

//                8000/192          =
//                ((8000/64)/3)     =
//                ((8000 >> 6) / 3) =
//                (((8000 >> 6) * 0x5555) >> 16)
//                (((8000 * 0x5555) >> 22

Please note that the parentheses are intentional. You don't want to compute (8000 * (0x5555/(1 << 16)) because the 2nd term is 0, and the product would be 0. Not good.

So a 1-liner in code would be something like:

 printf("Answer:  %lu\n", ((8000UL * 0x5555UL) >> 22));

This will yield 41, which is what "C" would yield for 8000/192, even though 42 is "closer". By checking LSBs you could round if you wanted to.

One could write a treatise on this topic, but fortunately someone much smarter than me already has.