When using Python & Julia, I can use a neat trick to investigate machine epsilon for a particular floating point representation.
For example, in Julia 1.1.1:
julia> 7.0/3 - 4/3 - 1
2.220446049250313e-16
julia> 7.0f0/3f0 - 4f0/3f0 - 1f0
-1.1920929f-7
I'm currently learning C and wrote this program to try and achieve the same thing:
#include <stdio.h>
int main(void)
{
float foo;
double bar;
foo = 7.0f/3.0f - 4.0f/3.0f - 1.0f;
bar = 7.0/3.0 - 4.0/3.0 - 1.0;
printf("\nM.E. for float: %e \n\n", foo);
printf("M.E. for double: %e \n\n", bar);
return 0;
}
Curiously, the answer I get depends on whether I use C11 or GNU11 compiler standard. My compiler is GCC 5.3.0, running on Windows 7 and installed via MinGW.
So in short, when I compile with: gcc -std=gnu11 -pedantic begin.c
I get:
M.E. for float: -1.192093e-007
M.E. for double: 2.220446e-016
as I expect, and matches Python and Julia. But when I compile with: gcc -std=c11 -pedantic begin.c
I get:
M.E. for float: -1.084202e-019
M.E. for double: -1.084202e-019
which is unexpected. I thought it might by GNU specific features which is why I added the -pedantic
flag. I have been searching on google and found this: https://gcc.gnu.org/onlinedocs/gcc/C-Extensions.html but I still am unable to explain the difference in behaviour.
To be explicit, my question is: Why is the result different using the different standards?
Update: The same differences apply with C99 and GNU99 standards.
In C, the best way to get the float
or double
epsilon is to include <float.h>
and use FLT_MIN
or DBL_MIN
.
The value of 7.0/3.0 - 4.0/3.0 - 1.0;
is not fully specified by the C standard because it allows implementations to evaluate floating-point expressions with more precision than the nominal type. To some extent, this can be dealt with by using casts or assignments. The C standard requires casts or assignments to “discard” excess precision. This is not a proper solution in general, because there can be rounding both with the initial excess precision and with the operation that “discards” excess precision. This double-rounding may produce a different result than calculating entirely with the nominal precision.
Using the cast workaround with the code in the question yields:
_Static_assert(FLT_RADIX == 2, "Floating-point radix must be two.");
float FloatEpsilon = (float) ((float) (7.f/3) - (float) (4.f/3)) - 1;
double DoubleEpsilon = (double) ((double) (7./3) - (double) (4./3)) - 1;
Note that a static assertion is required to ensure that the floating-point radix is as expected for this kludge to operate. The code should also include documentation explaining this bad idea: