Search code examples
floating-pointdoubleieee-754

Does "0.00x <= 0.001 * x" always hold for double?


Example:

0.008 == 0.001 * 8
0.009  < 0.001 * 9
0.035 == 0.001 * 35
0.036  < 0.001 * 36

I've also tested patterns like:

0.0x   <= 0.01   * x
0.000x <= 0.0001 * x

I've tested thousands in MATLAB and C.

My questions:

  1. If it always holds, why? If not, any counterexamples?

  2. If it always holds, are all below patterns correct?

    0.0000000000000x <= 0.0000000000001 * x ; for arbitrary zeros


Solution

  • .000005 <= .000001 * 5 evaluates to false.

    Matlab specifies double to be IEEE-754 binary64, but C standard does not, although many implementations use it. Commonly, bases two and ten are used for floating-point. In base ten, the posed relation holds, and in fact the two expressions are equal up to the bounds of the format. (When x is so large that converting it to double yields infinity, equality does not hold.) In other bases, counterexamples similar to those shown in this answer can be found.1 For the remainder of this answer, IEEE-754 binary64 is assumed, in both the format of the numbers and the behavior of the operations.

    We should understand how an expression such as .000…000x <= .000…0001 * x is evaluated:

    • First, each numeral in the source text is converted to double. During this conversion, the number is rounded to a representable value. This rounding is commonly done by rounding to the nearest representable value, with ties to the value with the even low digit (bit).
    • Then the multiplication is performed, and the result is as if the real number result is rounded to a representable value. Again, rounding to nearest is common.
    • Then the comparison <= is evaluated. This has no error; it produces true if and only if the right operand is greater than or equal to the left operand.

    I assume x is non-negative. For negative x, see the first addendum.

    First, consider using round-to-nearest.

    Converting .000001 to double yields 0.000000999999999999999954748111825886258685613938723690807819366455078125. This can be seen in a good C implementation with printf(".99f\n", .000001);. (Since the C standard neither fully specifies the conversion of decimal numerals to floating-point while compiling nor the conversion of floating-point to decimal by printf, there can be variation in implementations that do not use correct rounding-to-nearest.) As we can see, this is less than .000001. This is easily found by printing .1, .01, .001, and so on until we find a numeral that happens to round down. Then we test various x until we find one for which .00000x happens to round up. Converting .000005 to double yields 0.0000050000000000000004090152695701565477293115691281855106353759765625. Next, for small integers x, x is exactly representable in double, so converting that operand to double has no rounding error. We have nearly satisfied .00000x <= .000001 * x, since the left side contains a rounding up while the right side contains a rounding down. However, the multiplication could also have a rounding up that undoes our preparation. Again, testing a few x is likely to find an example where this does not occur—if the multiplication is not exact, then whether it happens to round up or down essentially depends on a single bit somewhere in x, so a few trials suffices to find a working example.

    Returning to the original scale, .001 instead of .000001, we can say .00x <= .001 * x holds for all x less than 253. This is because all such integers x are exactly representable in double, and the conversion of .001 to double rounds up, producing 0.001000000000000000020816681711721685132943093776702880859375. Therefore, even if the left side, .00x, rounds down, the right side contains a rounding up that can only be compensated for by a rounding down in the multiplication, as there is no rounding in the conversion of x to double. As each rounding can only move to the next representable value, not skip over any, a rounding down in the multiplication cannot bring the right side below the left. So .00x <= .001 * x holds for all x less than 253.

    Above this, converting x to double may cause a rounding error, and searching finds a counterexample easily (in the fourth odd number above 253): 9007199254741.001 <= .001 * 9007199254741001, for which converting 9007199254741.001 to double yields 9007199254741.001953125, converting 9007199254741001 to double yields 9007199254741000, and the right side evaluates to 9007199254741.

    If we consider other rounding modes:

    • With rounding toward −∞ (rounding down) or toward 0, the right side would suffer up to three rounding errors down, so we can expect numerous cases in which the relation fails.
    • With rounding toward +∞ (rounding up), the right side must experience at least one rounding error up, because .000…0001 is never exactly representable in base-two floating-point, and a rounding-up rule never permits the right side to shed this error, and the left side can have only a single rounding error that never overtakes the error on the right, so the relation must always hold.

    Addenda

    Can x be negative? The syntax of the question suggests no, as, for x = −3, .00x would become .00-3, which does not form a correct numeral. If we allow this to be -.003, then .00x <= .001 * x fails when x is so great in magnitude that converting it to double yields −∞ but is not so great that converting .00x to double yields −∞. In that case, we compare a finite value to −∞, and the comparison produces false. Within the bounds of the double format, where values remain finite, the comparison would suffer the issues discussed above, subject to some modification regarding the rounding rules.

    Note that if .000…0001 is sufficiently small, converting it to double may yield zero (in any standard rounding mode other than toward +∞), and x may be sufficiently large that it yields ∞, in which case multiplying them yields a NaN (or trap), and the relation does not hold because NaNs have no relation to numbers.

    Footnote

    1 In bases that are multiples of ten, there may be some unusual interactions that this answer does not explore.