Tags: c++, double, garbage

Reason for the strange output of a C++ program


Here is the code of a simple C++ program:

#include <bits/stdc++.h>  // GCC-specific umbrella header
using namespace std;

int main() {
    double x = 3;
    double y = .15;
    while (x > 0) {
        printf("%.15f ", x);  // fixed notation, 15 digits after the point
        cout << x << endl;    // default precision: 6 significant digits
        x -= y;
    }
    return 0;
}

output:

> 3.000000000000000 3
> 2.850000000000000 2.85
> 2.700000000000000 2.7
> 2.550000000000000 2.55
> 2.400000000000000 2.4
> 2.250000000000000 2.25
> 2.100000000000001 2.1
> 1.950000000000001 1.95
> 1.800000000000001 1.8
> 1.650000000000001 1.65
> 1.500000000000001 1.5
> 1.350000000000001 1.35
> 1.200000000000001 1.2
> 1.050000000000001 1.05
> 0.900000000000001 0.9
> 0.750000000000001 0.75
> 0.600000000000001 0.6
> 0.450000000000001 0.45
> 0.300000000000001 0.3
> 0.150000000000001 0.15
> 0.000000000000001 1.05471e-15

Now look at the 7th line. Isn't that strange? For some reason 2.25 - .15 comes out as 2.100000000000001. I know that can be avoided by using float, but I wanted to know exactly why that is happening.


Solution

  • Your question illustrates floating-point error, as dww describes.

    Because computers are binary, they cannot store floating-point values exactly once those values need more than a certain number of digits.

    To illustrate, integers are stored in binary precisely. 8 bits, 0000 0000, can represent 2^8 values, i.e. 0-255. 16 bits, 0000 0000 0000 0000, can represent 2^16 values, i.e. 0-65535. Negative values are stored using two's complement, so they are stored just as precisely as positive numbers.
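
    To see those bit patterns directly, here is a minimal sketch using std::bitset (the values are just for illustration):

        #include <bitset>
        #include <cstdint>
        #include <iostream>

        int main() {
            std::int8_t a = 5;        // 0000 0101
            std::int8_t b = -5;       // two's complement: 1111 1011
            std::uint16_t c = 65535;  // largest 16-bit unsigned value
            std::cout << std::bitset<8>(static_cast<std::uint8_t>(a)) << "\n";  // 00000101
            std::cout << std::bitset<8>(static_cast<std::uint8_t>(b)) << "\n";  // 11111011
            std::cout << std::bitset<16>(c) << "\n";  // 1111111111111111
            return 0;
        }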

    Note how none of these integers needs a floating point to be represented.

    A floating-point value is trickier to represent because the position of the point determines the exact value, and the point may fall in different places. If there are 16 digit positions, 0000000000000000, the point can be anywhere along the line: 16.00027562809373 could be represented, as could 756352.8363578476 or 0.000000000000001.
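
    You can ask the compiler how many decimal digits a double reliably holds; a minimal sketch (the printed values assume a standard IEEE 754 double):

        #include <iostream>
        #include <limits>

        int main() {
            // Decimal digits guaranteed to survive a round trip through text:
            std::cout << std::numeric_limits<double>::digits10 << "\n";      // 15
            // Decimal digits needed to uniquely identify any double:
            std::cout << std::numeric_limits<double>::max_digits10 << "\n";  // 17
            // Binary digits in the significand (with the implicit leading bit):
            std::cout << std::numeric_limits<double>::digits << "\n";        // 53
            return 0;
        }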

    BUT there is only a finite number of values available for floating point, just like for integers. In your output, anything that needs more than about 16 significant decimal digits cannot be represented accurately; equivalently, integers above 2^53 can no longer all be stored exactly, because the significand of a 64-bit double holds only 53 bits.
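
    A quick way to see that limit (a sketch assuming IEEE 754 doubles):

        #include <cstdio>

        int main() {
            double big = 9007199254740992.0;  // 2^53, still exactly representable
            double bigger = big + 1.0;        // rounds back to 2^53: adjacent
                                              // doubles here are already 2 apart
            std::printf("%.1f\n%.1f\n", big, bigger);  // same value twice
            std::printf("%d\n", big == bigger);        // 1 (true)
            return 0;
        }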

    Your output shows about 16 significant digits, which is exactly the decimal precision a 64-bit double can hold. (This does not depend on whether your operating system is 32-bit or 64-bit: on virtually every modern platform, double is a 64-bit IEEE 754 type, so the precision is the same.)

    In memory, a 64-bit double has this structure:

    0 | 000 0000 0000 | 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000

    sign (1 bit) | exponent (11 bits) | significand (52 bits)

    The significand is where the digits are stored, the exponent determines where the point falls, and the sign bit determines whether the value is positive or negative.
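
    You can extract those three fields from a real double with some bit twiddling; a minimal sketch, assuming a 64-bit IEEE 754 double (std::memcpy is the portable way to reinterpret the bits):

        #include <cstdint>
        #include <cstdio>
        #include <cstring>

        int main() {
            double d = 0.15;
            std::uint64_t bits;
            std::memcpy(&bits, &d, sizeof bits);  // copy the raw 64 bits

            std::uint64_t sign        = bits >> 63;                 // 1 bit
            std::uint64_t exponent    = (bits >> 52) & 0x7FF;       // 11 bits, biased by 1023
            std::uint64_t significand = bits & 0xFFFFFFFFFFFFFULL;  // 52 bits

            // For 0.15 the unbiased exponent is -3 (0.15 = 1.2 * 2^-3).
            std::printf("sign=%llu exponent=%lld significand=0x%013llx\n",
                        (unsigned long long)sign,
                        (long long)exponent - 1023,
                        (unsigned long long)significand);
            return 0;
        }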

    The trailing '1' in the printf column indicates there is some value smaller than 0.000000000000001 that the computer has accumulated, and printf is rounding it up. By the last iteration the number is so small and close to zero that printf shows it as 0.000000000000001 and cout shows it as 1.05471e-15; that is 0.00000000000000105471, which is still positive, so the loop runs one final time.
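
    You can watch that error appear by printing more digits than the default; a short sketch (the exact digits assume IEEE 754 doubles):

        #include <cstdio>

        int main() {
            // 0.15 has no exact binary representation; show its true stored value:
            std::printf("%.17f\n", 0.15);  // 0.14999999999999999
            // Subtract it from 3.0 twenty times, as the original loop does:
            double x = 3.0;
            for (int i = 0; i < 20; ++i) x -= 0.15;
            std::printf("%.17g\n", x);  // ~1.0547e-15 rather than exactly 0
            return 0;
        }

    This is also why loops over floating-point values usually compare against a small tolerance, for example while (x > 1e-9), instead of testing > 0 exactly.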