geometry line computational-geometry boost-geometry line-segment

How can I use 2D infinite lines as keys of an associative container that can be queried by proximity?

I have thousands of line segments that I'd like to cluster by collinearity. One way to do this is to build an associative container whose keys are infinite lines and whose values are collections of line segments. To add a segment, I would determine the infinite line it lies on and insert the segment into the corresponding bin.

Given such a setup, what is the best way to parameterize the infinite lines so that the data structure can be queried for line keys that are near a given line?

For example, I was thinking of using an R-tree of points (elsewhere in this project I am already using Boost.Geometry R-trees), where each point is the pair (x-intercept, y-intercept) of an infinite line. However, this only works for lines that are neither vertical nor horizontal. I could handle vertical and horizontal lines as special cases, but then I could not easily query for lines that are "near" a vertical or horizontal line. For a non-axis-aligned line, that query is just a 2D range query over the intercept points in the R-tree.

I'm wondering if there is some elegant way of handling this problem. How can I represent infinite 2D lines as points such that horizontal and vertical lines are no different than any other kind of line and such that lines that are near each other map to points that are near each other?

Solution

I have two solutions. The first is a simple one with some limitations:

For each infinite line, you could compute the foot of the perpendicular dropped from the origin onto the line, that is, the point on the line closest to the origin. You could store the coordinates of this point as a "signature" of the line. This works for all lines except those that pass through the origin: for such a line the signature point is always the origin itself, no matter what the slope is.
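As a rough illustration of the first solution, here is a minimal sketch that computes the signature point from the two endpoints of a segment. The names `Point2` and `line_signature` are made up for this example and are not part of Boost.Geometry. The sketch assumes the two endpoints are distinct.

```cpp
#include <cassert>

// Hypothetical plain 2D point type used only for this sketch.
struct Point2 {
    double x;
    double y;
};

// Returns the foot of the perpendicular dropped from the origin onto the
// infinite line through a and b (the point of that line closest to the origin).
// Precondition (assumed): a and b are distinct points.
Point2 line_signature(Point2 a, Point2 b) {
    const double dx = b.x - a.x;
    const double dy = b.y - a.y;
    const double len2 = dx * dx + dy * dy;
    assert(len2 > 0.0 && "segment endpoints must differ");

    // Parameter t of the point a + t * (b - a) that is closest to the origin.
    const double t = -(a.x * dx + a.y * dy) / len2;
    return {a.x + t * dx, a.y + t * dy};
}
```

Every segment lying on the same infinite line yields the same signature, up to floating-point rounding, so horizontal and vertical lines need no special handling. Any line through the origin still maps to (0, 0), which is the limitation described above.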

The second solution extends the first one to solve that problem. In addition to the coordinates of the point described above, you can also store the angle that the normal of the line makes with the x-axis. Each line is then represented by an ordered triplet (x, y, theta). You can store these triplets in an R-tree of 3D points and query that tree.

One catch: for a line through the origin, the normal can point in either of two opposite directions. Two such lines could be stored with theta values of pi/4 and 5*pi/4 radians respectively. They'd be coincident, but the way they are stored in the R-tree wouldn't reflect that. So, just for the lines that pass through the origin, you could enforce a convention, say that theta must lie in the range [0, pi). Such a convention fixes the problem. It should only be enforced for lines that pass through the origin, because for every other line the normal direction is already fixed by the direction from the origin to the point (x, y).
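Here is a minimal sketch of how the (x, y, theta) key could be computed, including the [0, pi) convention for lines through the origin. `LineKey`, `line_key` and the tolerance `eps` are assumptions made for this example. In particular, how close to the origin a line must be to count as "passing through" it is a choice you would need to tune for your data.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical key type: foot of the perpendicular plus the normal angle.
struct LineKey {
    double x;
    double y;
    double theta;
};

// Builds the (x, y, theta) key for the infinite line through (ax, ay) and
// (bx, by). eps is an assumed tolerance for treating the line as passing
// through the origin. Precondition (assumed): the two points are distinct.
LineKey line_key(double ax, double ay, double bx, double by,
                 double eps = 1e-9) {
    const double pi = std::acos(-1.0);
    const double dx = bx - ax;
    const double dy = by - ay;
    const double len2 = dx * dx + dy * dy;
    assert(len2 > 0.0 && "segment endpoints must differ");

    // Foot of the perpendicular from the origin onto the line.
    const double t = -(ax * dx + ay * dy) / len2;
    const double fx = ax + t * dx;
    const double fy = ay + t * dy;

    if (std::hypot(fx, fy) > eps) {
        // The normal points from the origin toward the foot; no folding here.
        return {fx, fy, std::atan2(fy, fx)};
    }

    // Line through the origin: (-dy, dx) is one of the two possible normals.
    // Fold its angle into [0, pi) so both choices give the same theta.
    double theta = std::atan2(dx, -dy);
    if (theta < 0.0) theta += pi;
    if (theta >= pi) theta -= pi;
    return {0.0, 0.0, theta};
}
```

With this convention, `line_key(1, 1, 2, 2)` and `line_key(2, 2, 1, 1)` produce the same key even though the segment is traversed in opposite directions. Note that theta is an angle and wraps around, so a range query close to the ends of its range may need extra care.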

Update:

Coming up with a solution that is better optimized for your use-case will require a clear definition of how you measure the "proximity" between two infinite lines.