Search code examples
c++boost-geometry

How do Boost.Geometry operations like union_ deal with the fundamental imprecision of floating point types?


I'm trying to judge whether / how I can make Boost.Geometry work for a particular use case. I cannot, however, find documentation about how the library deals with floating point types anywhere.

If you search the official documentation for the word "epsilon" you get zero hits as far as I can tell; however, it is clear from the library's behavior that it is implicitly using some version of the typical way one deals with floats when making comparisons because, for example, the union_ operation will union two polygons that are near each other but not overlapping if they are near enough.

Consider for example the following code which performs a binary search to determine the threshold distance that two unit squares need to be within to be considered adjacent when union-ing:

namespace bg = boost::geometry;

using point = bg::model::d2::point_xy<double>;
using polygon = bg::model::polygon<point, false>;

polygon create_poly(std::vector<std::tuple<double, double>> pts) {
    polygon poly;
    for (const auto& [x, y] : pts)
        bg::append(poly, bg::make<point>(x, y));
    auto [x_1, y_1] = pts[0];
    bg::append(poly, bg::make<point>(x_1, y_1));
    return poly;
}

bool perform_simple_union(const polygon& p1, const polygon& p2) {
    std::vector<polygon> output; 
    bg::union_(p1, p2, output);
    return output.size() == 1;
}

double find_epsilon(double left, double right) {

    if (right - left < std::numeric_limits<double>::epsilon())
        return left;
    double eps = (left + right) / 2;

    polygon a = create_poly(
        std::vector<std::tuple<double, double>>{
            {1.0, 1.0}, { 2.0,1.0 }, { 2.0, 2.0 }, { 1.0,2.0 }
        }
    );

    polygon b = create_poly(
        std::vector<std::tuple<double, double>>{
            {2.0 + eps, 1.0}, { 3.0 + eps, 1.0 }, { 3.0 + eps, 2.0 }, { 2.0 + eps,2.0 }
        }
    );

    if ( perform_simple_union(a, b) ) {
        return find_epsilon(eps, right);
    } else {
        return find_epsilon(left, eps);
    }
}

int main()
{
    auto eps = find_epsilon(0.0, 1.0);
    std::cout << "eps == " << eps << "\n";
}

when I compile and run the above with Visual Studio I get the output

eps == 1e-07

which is about the numeric limits epsilon of single precision floats. So it's treating double precision coordinates as if they are equivalent if they are within single precision epsilon from each other?

Basically I'd just like to know what the default behavior is so I can decide if it works for me.


Solution

  • In [the intro][1], it states:

    The library supports high precision arithmetic numbers, such as ttmath. [1]: https://www.boost.org/doc/libs/1_70_0/libs/geometry/doc/html/geometry/introduction.html

    The library design rationale goes into this a tiny bit more:

    [...], it would be too long, and it is not related to geometry. We just assume that there is a meta-function select_most_precise selecting the best type.

    They also implemented along the OGC Simple Feature Specification, which probably means that you can find more algorithmic robustness guarantees there.

    I know from reading the code that there are certain algorithms that take into account edge cases where the outcome can be made more robust (by doing operations in a certain order or noticing when features are very close, IIRC). A simple grep for e.g. robust might show you some in-roads there:

    policies/robustness/robust_point_type.hpp:// Meta-function to typedef a robust point type for a poli

    algorithms/detail/overlay/get_turn_info_helpers.hpp: // Used ranges - owned by get_turns or (for

    algorithms/detail/overlay/get_turn_info_helpers.hpp:// Version with rescaling, having robust points

    algorithms/detail/overlay/append_no_dups_or_spikes.hpp: // Try using specified robust policy

    I'm merely grazing the surface here, I don't claim to understand much of what is being noted there.

    Using arbitrary precision or decimals

    Precision is one dimension, source-fidelity when the input is in decimal form is another. Short of going to MPFR/GMP/ttmath (as mentioned) you can easily drop in Boost Multiprecision. This gives you fast proof-of-concept since it ships with boost, and also allows you to switch to GMP or MPFR backends transparently.

    See also:

    Live On Coliru

    #include <boost/geometry.hpp>
    #include <boost/multiprecision/cpp_dec_float.hpp>
    #include <iostream>
    namespace mp = boost::multiprecision;
    namespace bg = boost::geometry;
    
    //// Note, cpp_dec_float<0> is variable-precision!
    // using Number = mp::number<mp::cpp_dec_float<0>, mp::et_off>;
    
    // Fixed precision, avoids allocating and populates std::numeric_limits<>
    // with concrete data
    using Number = mp::number<mp::cpp_dec_float<50>, mp::et_off>;
    
    using point = boost::geometry::model::d2::point_xy<Number>;
    using polygon = bg::model::polygon<point, false>;
    
    polygon create_poly(std::vector<std::tuple<Number, Number>> pts) {
        polygon poly;
        for (const auto& [x, y] : pts)
            bg::append(poly, bg::make<point>(x, y));
        auto [x_1, y_1] = pts[0];
        bg::append(poly, bg::make<point>(x_1, y_1));
        return poly;
    }
    
    bool perform_simple_union(const polygon& p1, const polygon& p2) {
        std::vector<polygon> output; 
        bg::union_(p1, p2, output);
        return output.size() == 1;
    }
    
    Number find_epsilon(Number left, Number right) {
    
        Number eps = (left + right) / 2;
        if (right - left < std::numeric_limits<Number>::epsilon())
            return left;
    
        polygon a = create_poly(
            std::vector<std::tuple<Number, Number>>{
                {1.0, 1.0}, { 2.0,1.0 }, { 2.0, 2.0 }, { 1.0,2.0 }
            }
        );
    
        polygon b = create_poly(
            std::vector<std::tuple<Number, Number>>{
                {2.0 + eps, 1.0}, { 3.0 + eps, 1.0 }, { 3.0 + eps, 2.0 }, { 2.0 + eps,2.0 }
            }
        );
    
        if ( perform_simple_union(a, b) ) {
            return find_epsilon(eps, right);
        } else {
            return find_epsilon(left, eps);
        }
    }
    
    int main()
    {
        std::cout << "nextafter(0, 1):  " << nextafter(Number(0), Number(1)) << "\n";
        std::cout << "Number: eps()     " << std::numeric_limits<Number>::epsilon()      << "\n";
        std::cout << "Number: min_exp() " << std::numeric_limits<Number>::min_exponent10 << "\n";
        std::cout << "Number: max_exp() " << std::numeric_limits<Number>::max_exponent10 << "\n";
        std::cout << "Number: min()     " << std::numeric_limits<Number>::min()          << "\n";
        std::cout << "Number: max()     " << std::numeric_limits<Number>::max()          << "\n";
    
        auto eps = find_epsilon(0.0, 1.0);
    
        std::cout << std::setprecision(180);
        std::cout << "eps == " << eps << "\n";
    
        std::cout << std::boolalpha;
        std::cout << "zero? " << (eps == 0) << "\n";
    }
    

    Prints

    nextafter(0, 1):  1e-67108864
    Number: eps()     1e-49
    Number: min_exp() -67108864
    Number: max_exp() 67108864
    Number: min()     1e-67108864
    Number: max()     1e+67108864
    eps == 0
    zero? true
    

    For cpp_dec_float<0> it prints (note the "weird" numeric_limits<>::eps` in the variable-precision situation):

    Live On Coliru

    nextafter(0, 1):  1e-67108864
    Number: eps()     1e-08
    Number: min_exp() -67108864
    Number: max_exp() 67108864
    Number: min()     1e-67108864
    Number: max()     1e+67108864
    eps == 0
    zero? true