Search code examples
cnumerical-methodsfixed-point

Algorithm for fixed-point multiplication


I'm trying to rescale a timestamp (fractional part of seconds only) from nanoseconds (units of 10^-9 seconds) to the lower half of an NTP timestamp (units of 2^-32 seconds). Effectively this means multiplying by 4.2949673. But I need to do it without floating-point math, and without using integers larger than 32 bits (in fact, I'm actually writing this for an 8-bit microcontroller, so even 32-bit math is expensive, especially divisions).

I've come up with a few algorithms that work reasonably well, but I don't have any real grounding in numerical methods so I'd appreciate any suggestions as to how to improve them, or any other algorithms that would be more accurate and/or faster.

Algorithm 1

uint32_t intts = (ns >> 16) * 281474 + (ns << 16) / 15259 + ns / 67078;

The first two constants were chosen to slightly undershoot, rather than overshoot, the correct figure, and the final factor of 67078 was determined empirically to correct for this. Produces results within +/- 4 NTP units of the correct value, which is +/- 1 ns -- acceptable, but the residual changes with ns. I guess I could add another term.

Algorithm 2

uint32_t ns2 = (2 * ns) + 1;
uint32_t intts = (ns2 << 1)
  + (ns2 >> 3) + (ns2 >> 6) + (ns2 >> 8) + (ns2 >> 9) + (ns2 >> 10)
  + (ns2 >> 16) + (ns2 >> 18) + (ns2 >> 19) + (ns2 >> 20) + (ns2 >> 21)
  + (ns2 >> 22) + (ns2 >> 24) + (ns2 >> 30) + 3;

Based on the binary expansion of 4.2949673 (actually based on the binary expansion of 2.14748365, since I start by doubling and adding one to accomplish rounding). Possibly faster than algorithm 1 (I haven't gotten out the benchmarks yet). The +3 was determined empirically to cancel out undershoot from truncating all those low-order bits, but it doesn't do the best possible job.


Solution

  • uint32_t convert(uint32_t x) {
        const uint32_t chi = 0x4b82;
        const uint32_t clo = 0xfa09;
        const uint32_t round = 0x9525;
        const uint32_t xhi = x >> 16;
        const uint32_t xlo = x & 0xffff;
        uint32_t lowTerm = xlo*clo;
        uint32_t crossTerms = xhi*clo + xlo*chi;
        uint32_t rounded = crossTerms + (lowTerm >> 16) + round >> 16;
        uint32_t highTerm = xhi*chi;
        return (x << 2) + highTerm + rounded;
    }
    

    Basic fixed-point multiplication, simulating a 32x32 -> 64 product using four 16x16 -> 32 products. The constant round was chosen to minimize the error using a simple binary search. This expression is good to +/-0.6 NTP over the entire valid range.

    The leading 4 in the scale factor is handled in the shift. Compilers typically can generate pretty decent code for this type of thing, but it can often be streamlined with hand-written assembly if you need to.

    If you don't need so much accuracy, you can get rid of the lowTerm and round and get an answer that's good to +/-1.15 NTP:

    uint32_t convert(uint32_t x) {
        const uint32_t chi = 0x4b82;
        const uint32_t clo = 0xfa09;
        const uint32_t xhi = x >> 16;
        const uint32_t xlo = x & 0xffff;
        uint32_t crossTerms = xhi*clo + xlo*chi;
        uint32_t highTerm = xhi*chi;
        return (x << 2) + highTerm + (crossTerms >> 16) + 1;
    }