Search code examples
x86-64precisionc++23long-double128-bit

c++ long double (128-bit) precision


I try to calculate factorial on a 64-bit system using the following code snippet:

long double FactorialLD(long n)
{
    long double result = 1;
    if (n <= 0)
        return 1;
    for (long i = 2; i <= n; i++)
        result *= i;
    return result;
}

30! returns 265252859812191058647452510846976. However, using desktop calculator or https://www.calculatorsoup.com/calculators/discretemathematics/factorials.php returns 265252859812191058636308480000000. Why is there such a big discrepancy and what's causing that?

CPU: Intel x86_64

OS: Ubuntu 24.10

Compiler: g++ 14.2.0


Solution

  • With gcc on x86-64 Linux, long double is the 80-bit extended precision format. This has a 64-bit significand which is not enough to exactly represent 30!, whose value is a bit less than 2^108.

    (If you check sizeof(long double) you will see it is 16, but that's simply for reasons of alignment. The actual value only uses 80 bits and the other 48 are unused padding.)

    If you did have true 128-bit quadruple precision, which has a 113-bit significand (counting the implicit leading 1 bit), then you would indeed be able to get the exact value of 30! in this way. For instance, ARM64 on Linux uses quad-precision for long double: try on godbolt.

    (Current versions of gcc on x86-64 actually do have some support for 128-bit quad precision arithmetic, not via long double but via the new _Float128 type. Try on godbolt. However, what's still lacking is standard library support; for instance, you can't yet input or output _Float128 with printf, iostream::operator<<, etc.)

    Be aware, though, that no mainstream CPU/FPU currently has hardware support for 128-bit quad precision, so computations on that type are all done in software and will be relatively slow. (For that matter, x86 is the only mainstream platform that even supports 80-bit, and only in its legacy x87 instruction set, so it's somewhat slower than single (32-bit) or double precision (64-bit) which can be done with the more modern SSE* instructions.)