Search code examples
c++coptimizationfpu

division as multiply and LUT ? / fast float division reciprocal


Is it possible to make a reciprocal of float division in form of look up table (such like 1/f -> 1*inv[f] ) ? How it could be done? I think some and mask and shift should be appled to float to make it a form of index? How would be it exectly?


Solution

  • You can guess an approximate inverse like this:

    int x = bit_cast<int>(f);
    x = 0x7EEEEEEE - x;
    float inv = bit_cast<float>(x);
    

    In my tests, 0x7EF19D07 was slightly better (tested with the effects of 2 Newton-Raphson refinements included).

    Which you can then improve with Newton-Raphson:

    inv = inv * (2 - inv * f);
    

    Iterate as often as you want. 2 or 3 iterations give nice results.

    Better Initial Approximations

    To minimize the relative error:

    • 0x7EF311C2 (without refinement)
    • 0x7EF311C3 (1 refinement)
    • 0x7EF312AC (2 refinements)
    • 0x7EEEEBB3 (3 refinements)

    To minimize the absolute error for inputs between 1 and 2 (they work well enough outside that range, but they may not be the best):

    • 0x7EF504F3 (without refinement)
    • 0x7EF40D2F (1 refinement)
    • 0x7EF39252 (2 refinements)

    For three refinement steps, the initial approximation barely affects the maximum relative error. 0x7EEEEEEE works great, and I couldn't find anything better.