Tags: c++, bit, double-precision, granularity

Double granularity affecting numeric comparisons in C++


Below is a test block of code that is supposed to determine the equivalence of two doubles within a tolerance.

#include <cmath>  // for std::fabs

double lhs_1 = 0.02;
double lhs_2 = 0.04;
double rhs = 0.03;
double tolerance = 0.01;

// both pairs are intended to be within 0.01 of each other
bool is_match_1 = (std::fabs(lhs_1 - rhs) <= tolerance);
bool is_match_2 = (std::fabs(lhs_2 - rhs) <= tolerance);

However, is_match_2 turns out false while is_match_1 turns out true. I understand that numbers stored in the computer are discrete values and not continuous. Can someone share a solution? I would like to err on the side of passing the test within reason. Is there a way to increment the value of a double by one unit of whatever precision it currently has (I am not familiar with the bit layout of a double)? If so, I might just increment the tolerance value to allow for this granularity issue.
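
For example, something like the following is what I have in mind, if std::nextafter from <cmath> is indeed the right tool for stepping a double to its next representable value:

#include <cmath>
#include <limits>

// hypothetical helper: widen a tolerance by one representable step
double widen_by_one_step(double tolerance)
{
    return std::nextafter(tolerance, std::numeric_limits<double>::infinity());
}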

EDIT:

When this is actually implemented, the user will define the inputs and tolerances, so I'm just trying to give them an expected output for whatever values they enter.


Solution

  • Unfortunately, there are no "good" rules for choosing the tolerance.

    You have at your disposal the "machine epsilon"

    double epsilon = std::numeric_limits<double>::epsilon();  // from <limits>
    

    which is the gap between 1.0 and the next representable double: informally, the smallest value which, added to one, gives a result different from one.
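
    As a quick illustration of that property, something like this should hold on an IEEE 754 machine:

    #include <cassert>
    #include <iostream>
    #include <limits>

    int main()
    {
        const double eps = std::numeric_limits<double>::epsilon();
        std::cout << eps << '\n';        // prints roughly 2.22045e-16 for IEEE 754 doubles
        assert(1.0 + eps != 1.0);        // adding epsilon to 1.0 is observable
        assert(1.0 + eps / 4.0 == 1.0);  // a much smaller addition rounds away
    }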

    I usually write my tolerances as a function of epsilon. There are no hard rules, but, for instance, a comparison like this

    #include <algorithm>  // std::max
    #include <cmath>      // std::fabs
    #include <limits>     // std::numeric_limits

    bool fuzzy_equals(double a, double b)
    {
        static const double eps = std::numeric_limits<double>::epsilon();
        // relative comparison: scale the allowed difference by the larger magnitude
        return std::fabs(b - a) < 1024 * eps * std::max(std::fabs(a), std::fabs(b));
    }
    

    works well in many cases. You can adjust the 1024; I like powers of two, but you might not. The actual value you choose is problem-dependent. Epsilon for doubles is around 10^-16, so 1024 is quite small, and you will need a bigger factor in many cases: virtually any operation, including the subtraction inside fuzzy_equals, will "eat" one epsilon. Errors can cancel out, but on average n operations give about sqrt(n) * epsilon of accumulated error, so 1024 corresponds to the expected precision after roughly one million operations.
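    Applied to the question's absolute-tolerance test, the same idea amounts to padding the user-supplied tolerance by a small epsilon-scaled slack; the factor 4 below is an arbitrary choice in the same spirit as the 1024 above:

    #include <algorithm>  // std::max
    #include <cmath>      // std::fabs
    #include <limits>

    bool within_tolerance(double lhs, double rhs, double tolerance)
    {
        static const double eps = std::numeric_limits<double>::epsilon();
        // pad the tolerance slightly so borderline cases such as |0.04 - 0.03| <= 0.01 pass
        const double slack = 4 * eps * std::max(std::fabs(lhs), std::fabs(rhs));
        return std::fabs(lhs - rhs) <= tolerance + slack;
    }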

    In other cases, where the precision is not as good, for instance when testing the minimum of a function against a known value (minima are usually only determined up to sqrt(eps) accuracy), I use

    bool fuzzy_equals2(double a, double b)
    {
        static const double eps = std::numeric_limits<double>::epsilon();
        // looser comparison: sqrt(eps) instead of eps for quantities known less precisely
        return std::fabs(b - a) < 1024 * std::sqrt(eps) * std::max(std::fabs(a), std::fabs(b));
    }
    

    I often use other functions, like std::pow(eps, something), or even -1 / std::log(eps). This depends on what prior information I can derive from the problem and on the error I expect.

    When it comes to code structure, I use a functional approach and pass a comparer to my algorithms, a bit like the STL does with predicates. That way the comparison logic is not hard-coded into the algorithms; a sketch follows.
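
    A minimal sketch of that style (the names here are illustrative, not from any particular library): the algorithm takes the comparison as a parameter, so the caller decides which fuzzy_equals variant, and which tolerance, is appropriate:

    #include <vector>

    // generic membership test: the caller supplies the notion of equality
    template <typename Compare>
    bool contains(const std::vector<double>& values, double target, Compare eq)
    {
        for (double v : values)
            if (eq(v, target))
                return true;
        return false;
    }

    // usage: bool found = contains(samples, 0.03, fuzzy_equals);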

    In short, there is no one-size-fits-all rule. You have to choose depending on the problem at hand.