Search code examples
pythonpython-3.xshapesdata-fittingbest-fit

Determine most accurate best-fit algorithm for scattered points


I'm currently working with the skspatial package in Python3. I currently have two functions:

  1. skspatial.objects.Cylinder.best_fit to try to best fit the points into a cylinder
  2. A function that returns the bounding box of the scattered points. This function returns a custom class that should "extend" the skspatial.objects with a Cuboid object with similar methods, but in reality is only a bounding box.

My question is the following: Is there a way to determine which of these two shapes better encompass the scattered points? They both return valid cylinder or Cuboid objects on their own for the same set of scattered points, but only one of them actually fits the points better.

In case it matters for this question, the scattered points only represent the surface of whatever shape they actually are on, so there are no points "inside" the object.


Solution

  • I've decided to test my own metric with the explanation I gave in the comments above.

    The approach is as follows: Assuming we know that either shape (cylinder or cuboid) is the goal (and technically, all shapes that can be point-reflected/mirrored in 3D space given its center coordinate), then we can determine the percentage of points that are not on the surface of either shape:

    1. Define an error counter variable errors = 0.

    2. Determine the center of the shape.

    For the cylinder, it would be Point(cylinder.point + cylinder.vector * 0.5). For the cuboid, it would be the middle of the line from the (x_min, y_min, z_min) point to the (x_max, y_max, z_max) point. In skspatial, this would be done by Line.from_points(p_min, p_max).to_point(0.5).

    1. Main loop: For each point in the point cloud, draw a line from the previously calculated center point to the individual point:
    point_center_line = Line.from_points(center, point)
    
    1. Obtain the two intersection points that are returned when intersecting the point_center_line with the cylinder or cuboid shape. Since our shapes are "hollow", and the line is going through the center of the shape, there will always be 2 intersection points a and b:
    a, b = shape.intersect_line(point_center_line)
    
    1. Construct a line from these two points a, and b. This line has the length 2 * (length_of_point_center_line +- delta). delta is the length of the line from a or b (whichever is closer) to the individual point. The +- is because that point is either still within the shape itself or just outside.

    We now can check if delta is within the specified tolerance or not:

    delta = abs(length_of_point_center_line - length_of_a_b_line / 2)
    if delta > tolerance:
        errors += 1
    
    1. Loop end. Now, we can calculate the percentage: errors / number_of_points

    This will give for a point cloud, that actually is from a cylinder shaped object, the following output in my use case:

    Fitting error for Cuboid.best_fit: 0.951777280636341
    Fitting error for Cylinder.best_fit: 0.0019885657469550086
    

    Since the cuboid completely fits the cylinder point cloud, there are only a few points where the cylinder points actually are on the surface of the cuboid. However, for the cylinder, most points are within the tolerance to the surface, and only 0.19% of points are outside that tolerance.

    With this, I can simply check for the return value and take the function that has the lower percentage.