Search code examples
c++coptimizationfloating-pointieee-754

How does this float square root approximation work?


I found a rather strange but working square root approximation for floats; I really don't get it. Can someone explain me why this code works?

float sqrt(float f)
{
    const int result = 0x1fbb4000 + (*(int*)&f >> 1);
    return *(float*)&result;   
}

I've test it a bit and it outputs values off of std::sqrt() by about 1 to 3%. I know of the Quake III's fast inverse square root and I guess it's something similar here (without the newton iteration) but I'd really appreciate an explanation of how it works.

(nota: I've tagged it both and since it's both valid-ish (see comments) C and C++ code)


Solution

  • (*(int*)&f >> 1) right-shifts the bitwise representation of f. This almost divides the exponent by two, which is approximately equivalent to taking the square root.1

    Why almost? In IEEE-754, the actual exponent is e - 127.2 To divide this by two, we'd need e/2 - 64, but the above approximation only gives us e/2 - 127. So we need to add on 63 to the resulting exponent. This is contributed by bits 30-23 of that magic constant (0x1fbb4000).

    I'd imagine the remaining bits of the magic constant have been chosen to minimise the maximum error across the mantissa range, or something like that. However, it's unclear whether it was determined analytically, iteratively, or heuristically.


    It's worth pointing out that this approach is somewhat non-portable. It makes (at least) the following assumptions:

    • The platform uses single-precision IEEE-754 for float.
    • The endianness of float representation.
    • That you will be unaffected by undefined behaviour due to the fact this approach violates C/C++'s strict-aliasing rules.

    Thus it should be avoided unless you're certain that it gives predictable behaviour on your platform (and indeed, that it provides a useful speedup vs. sqrtf!).


    1. sqrt(a^b) = (a^b)^0.5 = a^(b/2)

    2. See e.g. https://en.wikipedia.org/wiki/Single-precision_floating-point_format#Exponent_encoding