c++floating-point precision floating-accuracy convex-hull

how can you practically solve the convex hull when there are floating precision issues?

Suppose that you have 100000 points on the curve y = x^2. You want to find the convex hull of these points. All the coordinates are floating numbers.

In my graham scan implementation the only place where I operate on floating numbers is when I initially sort all the points by their coordinates and then I have one function that determines whether three points make a left or a right turn.

Points:

struct point {
   double x; 
   double y;
};

Sorting comparator:

inline bool operator() (const point &p1, const point &p2) {
    return (p1.x < p2.x) || (p1.x == p2.x && p1.y > p2.y);
}

Left/Right turn:

inline int ccw(point *p1, point *p2, point *p3) {
  double left  = (p1->x - p3->x)*(p2->y - p3->y);
  double right = (p1->y - p3->y)*(p2->x - p3->x);
  double res = left - right;
  return res > 0;
}

My program says that out of 100 000 points only 68894 are part of the convex hull. But since they are on the curve, all of them should be part of the convex hull.

For your eye it doesn't make any difference. See figure below. The red points are part of the convex hull.

But if you look close enough, and zoom into the points, you'll see that some of them are blue, so they are not included in the convex hull.

Now my initial assumption is that floating point errors are causing this problem.

I guess I could use an external library that has an arbitrary precision for floating point numbers, but I'm more interested in the simple data types that we have for example in C++.

How could I increase the accuracy? I've read about epsilons, but how would using an epsilon help here? I would still assume some points that are close to each other to be the same, so I won't get an accuracy closer to 100%.

What's the best way to approach this problem?

Solution

You are correct that all of the points should be on the convex hull if you are indeed using points of the form (x, x^2). However, three points may be collinear. If you're shifting them or doing anything else weird, this goes out the window.

If you get to choose your 100000 points, I would suggest using the integers in [-50000,49999]. Your ccw function will compute left and right to be integers smaller in absolute value than 2.5e14 < 2^53, so no roundoff will occur.

Your coordinate-based sort will work correctly regardless of the input.

For general inputs, the following ccw predicate is buggy:

inline int ccw(point *p1, point *p2, point *p3) {
  double left  = (p1->x - p3->x)*(p2->y - p3->y);
  double right = (p1->y - p3->y)*(p2->x - p3->x);
  double res = left - right;
  return res > 0;
}

There can be roundoff both in the subtractions and in the multiplications. If all of your points lie in a H*W bounding box, the x-coordinate differences will be computed with an absolute error around H*eps/2 and the y-coordinate differences will be computed with an absolute error around W*eps/2. The products will therefore be computed with an absolute error around H*W*eps/2. If fabs(left - right) < 3*H*W*eps/2, you need to evaluate left and right more precisely. eps here is 2^-52.

I'd probably recommend just using MPFR if the double comparison doesn't tell you anything. You can do it without, however. The trick from Kahan summation will get you the low bits from the differences, and the 2²⁷+1 trick can help you compute the products exactly.