Tags: c++, c++11, floating-point, x86-64, ieee-754

Range of representable values of 32-bit, 64-bit and 80-bit float IEEE-754?


The C++ standard says of floating literals:

If the scaled value is not in the range of representable values for its type, the program is ill-formed.

The scaled value is the significant part multiplied by 10 raised to the power of the exponent part.

Under x86-64:

  • float is a single-precision IEEE-754
  • double is a double-precision IEEE-754
  • long double is an 80-bit extended precision IEEE-754

In this context, what is the range of representable values for each of these three types? Where is this documented, or how is it calculated?


Solution

  • The answer (if you're on a machine with IEEE floating point) is in float.h (<cfloat> in C++): FLT_MAX, DBL_MAX and LDBL_MAX. On a system with full IEEE support, these are approximately 3.4E+38, 1.8E+308 and 1.2E+4932. (The exact spelling of the values may differ depending on how the compiler handles input and rounding. g++, for example, defines them as compiler built-ins.)

    EDIT:

    WRT your question (since neither I nor the other responders actually answered it): the range of representable values is [-type_MAX, type_MAX], where type is one of FLT, DBL, or LDBL.