performanceassemblybit-manipulationmathapproximation# Division by a constant using shifts and adds/subtracts

Hi all I'm trying to divide by an unsigned constant using only shifts and adds/subtracts - I have no problem with this if it were multiplication, but I'm a bit stumped by the division.

For example, lets say the constant divisor is 192 and lets say the dividend is 8000

"full result" y = 8000/192 = 41 (assuming I'm not keeping fractional bits)

y = 8000 >> 8 ... 31 y = 8000 >> 7 ... 62

But how do I get a more accurate solution?

Many thanks!

Solution

There is almost certainly a more elegant way to do it, but here's enough to get you started.

Usually division in this way is done by multiplying by the reciprocal, i.e., first multiplying and then right-shifting.

(Note: remember that multplication can be accomplished by shifts and adds (e.g. `n * 3 = (n*2) + (n*1) = (n << 1) + (n) )`

but I'm just going to use multiplication here. Your question said "shifts & adds" and I'm justifying my shorthand use of multiplication)

In the examples below, I'm trying to explain the concepts with an example. In your specific case, you'll want to consider issues such as

sign (I'm using unsigned ints below)

overflow (below I'm using 32-bit unsigned longs to hold intermediate values but if you're on a smaller uC beware, adjust accordingly

rounding (e.g. should 9/5 return 1 or 2? In C, it's 1, but maybe you want 2 because it's closer to the correct answer?)

To the extent that you can (available bits), do your all of your multiplies before your divides (minimizing truncation errors). Again, be aware of overflow.

As I said, read the below to understand the concepts, then tailor to your needs.

Dividing by 192 is the same as multiplying by 1/192, which is the same as dividing by (64 * 3). There is not an exact (finite) binary representation of 1/3, so we're approximating it with 0x5555/(1 << 16).

To divide by 192, we divide by 64 and then divide by 3. To divide by 3, we multiply by 0x5555 and shift right by 16 (or multiply by 0x55 and >> 8, or...)

```
// 8000/192 =
// ((8000/64)/3) =
// ((8000 >> 6) / 3) =
// (((8000 >> 6) * 0x5555) >> 16)
// (((8000 * 0x5555) >> 22
```

Please note that the parentheses are intentional. You **don't** want to compute `(8000 * (0x5555/(1 << 16))`

because the 2nd term is 0, and the product would be 0. Not good.

So a 1-liner in code would be something like:

```
printf("Answer: %lu\n", ((8000UL * 0x5555UL) >> 22));
```

This will yield 41, which is what "C" would yield for `8000/192`

, even though 42 is "closer". By checking LSBs you could round if you wanted to.

One could write a treatise on this topic, but fortunately **someone much smarter than me already has**.

- Performance: XmlReader or LINQ to XML
- using Numba for scipy fsolve, but get error
- Performance degradation in AKS Windows node pool vs. virtual machine-hostes application in IIS
- Is f(int const) better than f(int) for compiler optimization?
- Collections and the Powershell Pipeline
- When should i use async/await in Python?
- Performance impact due to store.Open and store.Certificates.Find methods in X509Store
- JavaScript Set vs. Array performance
- Measure the time it takes to execute a t-sql query
- Why is windows so slow in opening files first time and is there a faster way
- How do I profile a Python script?
- What is the fastest way to decide if a digit appears in a string?
- Compare List of objects in java script
- How can I efficiently execute multiple C++ benchmarking algorithms using Windows cmd
- Firebase Firestore Query Performance
- Efficiency of while(true) ServerSocket Listen
- OpenGL core profile incredible slowdown on OS X
- Is there a container similar to `std::deque` but with custom block size and better performance?
- minimize() - iteration time keeps increasing
- StringBuffer and String usage scenario in Java
- How can I accurately benchmark unaligned access speed on x86_64?
- How to analyse apache log on 1 server with 4 cores faster
- Query Execution time in Management Studio & profiler. What does it measure?
- In Julia, how to convert a unsigned number to a signed number like in C?
- harvesine vectorization for vector list
- BIGSERIAL vs SERIAL in PostgreSQL
- Should using "any" be avoided when optimising code in Swift?
- Compute the max sum circular area
- Selecting the maximum number of values from a column that share values in another column in R
- Google Sheets Query to extract data and replace checkboxes with values. (Diff)