Today in my C++ programming lessons, my proff told me that one should never compare two floating point values directly.
So I tried this piece of code and found out the reason for his statement.
double l_Value=94.9;
print("%.20lf",l_Value);
And I found the results as 94.89999999 ( some relative error )
I understand that floating numbers are not stored in the way one presents it to the code. Squeezing those ones and zeros in binary form involves some relative rounding errors.
Iam looking for solutions to two problems. 1. Efficient way to compare two floating values. 2. How to add a floating value to another one. Example. Add 0.1111 to 94.4345 to get the exact value as 94.5456
Thanks in advance.
- Efficient way to compare two floating values.
A simple double a,b; if (a == b)
is an efficient way to compare two floating values. Yet as OP noticed, this may not meet the overall coding goal. Better ways depend on the context of the compare, something not supplied by OP. See far below.
- How to add a floating value to another one. Example. Add 0.1111 to 94.4345 to get the exact value as 94.5456
Floating values as source code have effective unlimited range and precision such as 1.23456789012345678901234567890e1234567
. Conversion of this text to a double
is limited typically to one of 264 different values. The closest is selected, but that may not be an exact match.
Neither 0.1111, 94.4345, 94.5456
can be representably exactly as a typical double
.
OP has choices:
1.) Use another type other than double, float
. Various libraries offer decimal floating point types.
2) Limit code to rare platforms that support double
to a base 10 form such that FLT_RADIX == 10
.
3) Write your own code to handle user input like "0.1111"
into a structure/string and perform the needed operations.
4) Treat user input as strings and the convert to some integer type, again with supported routines to read/compute/and write.
5) Accept that floating point operations are not mathematically exact and handle round-off error.
double a = 0.1111;
printf("a: %.*e\n", DBL_DECIMAL_DIG -1 , a);
double b = 94.4345;
printf("b: %.*e\n", DBL_DECIMAL_DIG -1 , b);
double sum = a + b;
printf("sum: %.*e\n", DBL_DECIMAL_DIG -1 , sum);
printf("%.4f\n", sum);
Output
a: 1.1110000000000000e-01
b: 9.4434500000000000e+01
sum: 9.4545599999999993e+01
94.5456 // Desired textual output based on a rounded `sum` to the nearest 0.0001
More on #1
If an exact compare is not sought but some sort of "are the two values close enough?", a definition of "close enough" is needed - of which there are many.
The following "close enough" compares the distance by examining the ULP of the two numbers. It is a linear difference when the values are in the same power-of-two and becomes logarithmic other wise. Of course, change of sign is an issue.
float
example:
Consider all finite float
ordered from most negative to most positive. The following, somewhat-portable code, returns an integer for each float
with that same order.
uint32_t sequence_f(float x) {
union {
float f;
uint32_t u32;
} u;
assert(sizeof(float) == sizeof(uint32_t));
u.f = x;
if (u.u32 & 0x80000000) {
u.u32 ^= 0x80000000;
return 0x80000000 - u.u32;
}
return u.u3
}
Now, to determine if two float
are "close enough", simple compare two integers.
static bool close_enough(float x, float y, uint32_t ULP_delta) {
uint32_t ullx = sequence_f(x);
uint32_t ully = sequence_f(y);
if (ullx > ully) return (ullx - ully) <= ULP_delta;
return (ully - ullx) <= ULP_delta;
}