If a floating point number could be output so that there was no truncation of value (say with setprecision) and the number was output in fixed notation (say with fixed), what buffer size would be required to guarantee that the entire fractional part of the floating point number could be stored in the buffer?
I'm hoping there is something in the standard, like a #define or something in numeric_limits, which would tell me the maximum base-10 value place of the fractional part of a floating point type.
I asked about the maximum number of base-10 digits in the integral part of a floating point type here: What Are the Maximum Number of Base-10 Digits in the Integral Part of a Floating Point Number
But I realize this may be more complex. For example, 1.0 / 3.0 is an infinitely repeating series of digits. When I output that using fixed formatting, I get this many places before the repeating 0s:
0.333333333333333314829616256247390992939472198486328125
But I can't necessarily say that's the maximum precision, because I don't know how many of those trailing 0s were actually represented in the floating point's fraction, and it hasn't been shifted down by a negative exponent.
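For reference, output like the line above can be produced with something along these lines. This is only a sketch: format_fixed is a hypothetical helper name, and whether the digits past the 17th are the exact binary value or just padding depends on the standard library implementation.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format `value` in fixed notation with `precision`
// digits after the decimal point and return the text.
std::string format_fixed(double value, int precision) {
    assert(precision >= 0);
    std::ostringstream out;
    out << std::fixed << std::setprecision(precision) << value;
    return out.str();
}
```

Calling format_fixed(1.0 / 3.0, 60) gives "0." followed by 60 digits; how many of the trailing digits are meaningful is exactly the open question here.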
I know we have min_exponent10; is that what I should be looking at for this?
For the 32-bit and 64-bit IEEE 754 formats, it can be calculated as described below.
It is all about negative powers of 2. So let's see how each exponent contributes:
2^-1 = 0.5 i.e. 1 digit
2^-2 = 0.25 i.e. 2 digits
2^-3 = 0.125 i.e. 3 digits
2^-4 = 0.0625 i.e. 4 digits
....
2^-N = 0.0000.. i.e. N digits
As the base-10 numbers always end with 5, you can see that the number of base-10 digits increases by 1 when the exponent decreases by 1. So 2^(-N) will require N digits.
Also notice that when adding those contributions, the number of resulting digits is determined by the smallest contribution. For example, 2^-1 + 2^-3 = 0.5 + 0.125 = 0.625 needs 3 digits, the same as 2^-3 alone. So what you need to find out is the smallest exponent that can contribute.
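The digit-count pattern above can be checked exactly without relying on floating point output. The following is a minimal sketch that builds the decimal digits of 2^-N by repeatedly halving a digit string; pow2_neg_fraction_digits is a hypothetical helper name introduced only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the exact fractional digits of 2^-n
// (the part after "0.") as a decimal string.
std::string pow2_neg_fraction_digits(int n) {
    assert(n >= 1);
    std::string digits = "5";  // 2^-1 = 0.5
    for (int i = 1; i < n; ++i) {
        // Halve the decimal fraction with schoolbook long division by 2.
        std::string next;
        int carry = 0;
        for (char c : digits) {
            int cur = carry * 10 + (c - '0');
            next.push_back(static_cast<char>('0' + cur / 2));
            carry = cur % 2;
        }
        // The previous value ended in an odd digit (5), so a remainder is
        // left over and contributes one extra trailing digit, again a 5.
        if (carry) next.push_back('5');
        digits = next;
    }
    return digits;
}
```

With this, pow2_neg_fraction_digits(4) yields "0625", and the returned string always has exactly n characters, which is the "2^-N requires N digits" observation.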
For 32 bit IEEE 754 you have:
Smallest exponent -126
Fraction bits 23
So the lowest fraction bit of the smallest (subnormal) values has exponent -126 + -23 = -149, so the smallest contribution will come from 2^-149, i.e.
For 32 bit IEEE 754 printed in base-10 there can be 149 fractional digits
For 64 bit IEEE 754 you have:
Smallest exponent -1022
Fraction bits 52
So the lowest fraction bit of the smallest (subnormal) values has exponent -1022 + -52 = -1074, so the smallest contribution will come from 2^-1074, i.e.
For 64 bit IEEE 754 printed in base-10 there can be 1074 fractional digits
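The same numbers can be derived from std::numeric_limits rather than hard-coded. This is a sketch under stated assumptions: max_fractional_digits is a hypothetical helper name, the type is assumed to be binary IEEE 754 with subnormals supported, and note that the standard's min_exponent is one greater than the IEEE exponent used above (-125 for float, -1021 for double), while digits counts the implicit leading bit (24 and 53).

```cpp
#include <limits>

// Hypothetical helper: maximum number of base-10 fractional digits needed
// to print any value of a binary IEEE 754 type T exactly in fixed notation.
// Assumes subnormal values are supported.
template <typename T>
constexpr int max_fractional_digits() {
    using limits = std::numeric_limits<T>;
    static_assert(limits::is_iec559, "assumes an IEEE 754 type");
    static_assert(limits::radix == 2, "assumes a binary format");
    // (digits - 1) is the number of stored fraction bits, and
    // (min_exponent - 1) is the smallest normal IEEE exponent.
    return (limits::digits - 1) - (limits::min_exponent - 1);
}
```

On typical platforms max_fractional_digits<float>() evaluates to 149 and max_fractional_digits<double>() to 1074. A buffer for the fractional part would additionally need room for whatever else is stored alongside it, such as a terminating null character. As far as I can tell, min_exponent10 does not give this figure directly, since it describes the smallest normalized power of ten rather than the position of the last nonzero digit.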