Tags: r, dataframe, distance, latitude-longitude, minimum

In R, how do I identify which two rows have minimum distance in my (x,y) dataframe?


In R, I have a dataframe with x,y as lat,long. How do I find which two rows have the minimum distance between them, and assign a number in a new column to mark them? A simple example below shows the two rows, (5,3) and (5,2), that have the minimum distance; column C gives them the same group number.

df


Solution

  • You may need distm from the geosphere package, which computes the matrix of pairwise geographic distances (in meters) between lon/lat points.

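    library(geosphere)
    # geosphere expects the columns in (lon, lat) order
    xy <- setNames(data.frame(rbind(c(0,0),c(90,90),c(10,10),c(-120,-45))),c("lon","lat"))
    d <- distm(xy)  # symmetric matrix of pairwise distances
    # positions of the smallest non-zero distance (d > 0 skips the zero diagonal)
    inds <- which(min(d[d>0])==d,arr.ind = TRUE)
    out <- cbind(xy,C = NA)
    out$C[inds[,"row"]] <- 1  # mark both rows of the closest pair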
    

    which gives

    > out
       lon lat  C
    1    0   0  1
    2   90  90 NA
    3   10  10  1
    4 -120 -45 NA
    

    Dummy data

    > dput(xy)
    structure(list(lon = c(0, 90, 10, -120), lat = c(0, 90, 10, -45
    )), class = "data.frame", row.names = c(NA, -4L))
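
If the (x,y) values are plain planar coordinates rather than lon/lat, the same idea can be sketched in base R with dist(), which computes Euclidean distances. This is only a sketch under stated assumptions: the data frame below is hypothetical (only the points (5,3) and (5,2) come from the question), and the diagonal is masked with NA instead of filtering on d > 0, so that duplicated points at distance 0 would still be reported as the closest pair.

```r
# Hypothetical data: only (5,3) and (5,2) appear in the question; the other rows are made up.
df <- data.frame(x = c(5, 1, 5, 9), y = c(3, 8, 2, 7))

# Pairwise Euclidean distances as a full symmetric matrix.
d <- as.matrix(dist(df[, c("x", "y")]))

# Mask self-distances so they never count as the minimum.
diag(d) <- NA

# Row/column indices of every cell equal to the minimum distance
# (each pair shows up twice because the matrix is symmetric).
inds <- which(d == min(d, na.rm = TRUE), arr.ind = TRUE)

# Mark the rows of the closest pair with group number 1.
df$C <- NA_integer_
df$C[unique(inds[, "row"])] <- 1L
```

If several pairs tie for the minimum distance, all of their rows receive the same group number here; giving each pair its own number would need an extra step.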