Here is a pseudocode of my problem.
I have an array of IEEE 754 double precision positive numbers.
The array can come in a random order but numbers are always the same, just scrambled in their positions. Also these numbers can vary in a very wide range in the valid IEEE range of the double
representation.
Once I have the list, I initialize a variable:
double sum_result = 0.0;
And I accumulate the sum on sum_result
, in a loop over the whole array. At each step I do:
sum_result += my_double_array[i]
Is it guaranteed that, whatever the order of the initial array of double
, if the numbers are the same, the printed out sum result will be always the same?
Is it guaranteed that, whatever the order of the initial array of double, if the numbers are the same, the printed out sum result will be always the same?
No, FP addition is not associative. Remember it is called floating point - the absolute precision "floats" about relative to 1.0. Any given operation like addition (+
) is subject to round-off error.
Yet if the sum is done and the inexact flag is clear, then yes, the order was not relevant.**
Simple counter example.
#include <math.h>
#include <float.h>
#include <fenv.h>
#include <stdio.h>
int main(void) {
double a[3] = { DBL_MAX, -DBL_MAX, 1.0 };
fexcept_t flag;
feclearexcept(FE_ALL_EXCEPT);
printf("%e\n", (a[0] + a[1]) + a[2]);
fegetexceptflag(&flag, FE_INEXACT);
printf("Inexact %d\n", !!(flag & FE_INEXACT));
feclearexcept(FE_ALL_EXCEPT);
printf("%e\n", a[0] + (a[1] + a[2]));
fegetexceptflag(&flag, FE_INEXACT);
printf("Inexact %d\n", !!(flag & FE_INEXACT));
printf("%d\n", FLT_EVAL_METHOD);
return (EXIT_SUCCESS);
}
Output
1.000000e+00 // Sum is exact
Inexact 0
0.000000e+00 // Sum is inexact
Inexact 1
0 // evaluate all operations ... just to the range and precision of the type;
Depending on FLT_EVAL_METHOD
, FP math may use wider precession and range, yet the above extreme example sums will still differ.
** aside from maybe a result of 0.0 vs -0.0
To see why, try a based 10 text example with 4 digits of precision. The same principle applies to double
with its usual 53 binary digits of precision.
a[3] = +1.000e99, -1.000e99, 1.000
sum = a[0] + a[1] // sum now exactly 0.0
sum += a[2] // sum now exactly 1.0
// vs.
sum = a[1] + a[2] // sum now inexactly -1.000e99
sum += a[0] // sum now inexactly 0.0
Re: "printed out sum result will be always the same" : Unless code prints with "%a"
or "%.*e"
with higher enough precision, the text printed may lack significance and two different sums may look the same. See Printf width specifier to maintain precision of floating-point value