Search code examples
algorithmoptimizationcurve-fittingcomputational-geometry

Fitting two lines to a set of 2D points


Given a set of 2D points (black dots in the picture) I need to choose two lines to somehow represent those points. I probably need to minimize the sum of squares of [distance of x to the closer of two lines]^2, although if any other metric makes this easier, this is fine too.

The obvious but ineffective approach is to try min squares over all 2^n partitions. A practical approach is probably iterative improvement, maybe starting with a random partition.

Is there any research on algorithms to handle this problem?

two lines


Solution

  • Two related concepts from the literature come to mind.

    First, my intuition is that there should be a way to interpret this problem as estimating the parameters for a mixture model. I haven't worked out the details, since parameter estimation typically uses expectation--maximization, and I can just describe how that would work: initialize a partition into two parts, then alternately run a regression on each part and reassign points based on their distance to the new regression lines.

    Second, assuming that the input is relatively clean, you should be able to get a good initial partition using RANSAC. For some small k, take two disjoint random samples of k points and fit lines through them, then assign every other point. For a (100 − x)% chance of success you'll want to repeat this about ln(100/x) × 22k−1 times and take the best one.