r, points, euclidean-distance

Closest point for each of a huge number of points


We are given a huge set of points in the 2D plane. For each point, we need to find the closest other point within the set. For instance, suppose the initial set is as follows:

 foo <- data.frame(x=c(1,2,4,4,10),y=c(1,2,4,4,10))

The output should be like this:

 ClosestPair(foo)
 2
 1
 4
 3
 3 # (could be 4 also)

Any idea?


Solution

  • The traditional approach is to preprocess the data into a spatial data structure, often a k-d tree, that answers "nearest point" queries quickly (roughly O(log n) per query on average in low dimensions, versus O(n) for a linear scan).

    There is an implementation in the nnclust package.

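    library(nnclust)
    # nnfind expects a numeric matrix, one row per point
    foo <- cbind(x=c(1,2,4,4,10),y=c(1,2,4,4,10))
    # i[k] is the row index of the nearest neighbour of point k
    i <- nnfind(foo)$neighbour
    # Draw the points, with an arrow from each point to its nearest neighbour
    plot(foo)
    arrows( foo[,1], foo[,2], foo[i,1], foo[i,2] )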
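
To sanity-check a tree-based result on a small sample, a brute-force version in base R may help. This is only a sketch: `closest_point` is a hypothetical helper name, not part of nnclust, and it builds the full distance matrix, so it needs O(n^2) time and memory and is not suitable for the huge set itself.

```r
# Brute-force nearest neighbour (hypothetical helper, for small inputs only).
closest_point <- function(pts) {
  d <- as.matrix(dist(pts))        # pairwise Euclidean distances
  diag(d) <- Inf                   # a point must not be its own neighbour
  unname(apply(d, 1, which.min))   # per row, index of the nearest other point
}

foo <- cbind(x = c(1, 2, 4, 4, 10), y = c(1, 2, 4, 4, 10))
closest_point(foo)
# 2 1 4 3 3
```

Ties are broken in favour of the lower index, which is why the last point reports 3 rather than 4.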