Search code examples
gccfloating-pointc99ieee-754quadruple-precision

long double subnormals/denormals get truncated to 0 [-Woverflow]


In the IEEE754 standarad, the minimum strictly positive (subnormal) value is 2−16493 ≈ 10−4965 using Quadruple-precision floating-point format. Why does GCC reject anything lower than 10-4949? I'm looking for an explanation of the different things that could be going on underneath which determine the limit to be 10-4949 rather than 10−4965.

#include <stdio.h>

void prt_ldbl(long double decker) {
    unsigned char * desmond = (unsigned char *) & decker;
    int i;

    for (i = 0; i < sizeof (decker); i++) {
         printf ("%02X ", desmond[i]);
    }
    printf ("\n");
}

int main()
{
    long double x = 1e-4955L;
    prt_ldbl(x);
}

I'm using GNU GCC version 4.8.1 online - not sure which architecture it's running on (which I realize may be the culprit). Please feel free to post your findings from different architectures.


Solution

  • Your long double type may not be(*) quadruple-precision. It may simply be the 387 80-bit extended-double format. This format has the same number of bits for the exponent as quad-precision, but many fewer significand bits, so the minimum value that would be representable in it sounds about right (2-16445)


    (*) Your long double is likely not to be quad-precision, because no processor implements quad-precision in hardware. The compiler can always implement quad-precision in software, but it is much more likely to map long double to double-precision, to extended-double or to double-double.