Search code examples
ralgorithmcomputational-geometry

how to exclude the existence of a line 'segregating' a set of points


Apologies for the vague title: I can't find a proper name for the theory I am looking for (that's why I am asking the question), so I'm going to explain it with an example, and I hope someone can point me in the right direction.

Suppose you have a set of points in 2D.
The following R code:

# make a random set of N points in 2D space as a numerical matrix
set.seed(1010)
d = 2
N = 15
ps <- matrix(rnorm(d*N), , d)

# center the points (subtract the mean of each coordinate)
pss <- scale(ps,scale=F)

# represent the points in a 2D plot, with the origin (the new mean) in red
plot(pss)
text(pss,label=1:N,pos=4)
points(0,0,col=2,pch=16)
text(0,0,label=0)
abline(v=0)
abline(h=0)

should make a plot like:
enter image description here

Consider point 7. Intuitively one can see that there are several possible lines passing through point 7, which 'leave' all other points on 'one side' of the line (i.e. 'segregate' them in the half-plane defined by the line).

Consider instead point 6. There can never be any line passing through point 6 for which one half-plane contains all the points.

A point like 9 can also have such a line, although it's not particularly evident from the plot.

Question: is there any way to exclude the existence of such a line for each specific point? Meaning, could one do some operations on the coordinates of the points proving that such a line can NOT exist for a given point (so one could quickly classify the point into one that can and or can't have it)? I'm also thinking of higher dimensional applications, where lines would be planes, etc.

All my searches on the topic so far took me to concepts like 'convex hull', and 'boundary', which seem indeed quite closely related to what I'm looking for, but go far beyond my simple requirement of classifying the points, and are reported to be 'output-sensitive', indeed because they provide a lot of information on the hull itself, which I do not need.

Any ideas?

Thanks!


Solution

  • Given a set of points, the following two statements about an individual point p are equivalent:

    • There is a line through p such that every point in the set lies on the same side of the line (or on the line),
    • p is on the boundary of the set's convex hull.

    This is true because if p is in the interior of the convex hull, then any line through p divides the convex hull into two parts. If one side of the line has no points in the set, then the other side is a smaller convex region which contains every point. This would contradict the definition of the convex hull, which is the smallest convex set containing every point.

    So the set of points which satisfy the property about having a line which doesn't divide the set in two, is precisely the same set of points that are on the boundary of the convex hull. The latter is what a convex hull algorithm returns, so logically, any algorithm which solves your problem for each point in the set is a convex hull algorithm.

    The only subtle difference I can think of is that standard convex hull algorithms usually also return the boundary points in a particular order, whereas you don't need them in any particular order. But I don't think there is a more efficient convex hull algorithm when the order doesn't matter. The running time is O(n log n) in the worst case, which gives you an average query time per point of at most O(log n).

    That is asymptotically optimal for this problem if you want to test each point in the set, since computing even the unordered convex hull takes at least O(n log n) in the worst case, according to an arXiv paper by Herman Haverkort. There's evidence that this is optimal even for just finding the cardinality of the convex hull (see this paper by Davis Avis).