I have two data sets, A and B, which give locations of different points in the UK as such:
A = data.frame(reference = c("C", "D", "E"), latitude = c(55.32043, 55.59062, 55.60859), longitude = c(-2.3954998, -2.0650243, -2.0650542))
B = data.frame(reference = c("C", "D", "E"), latitude = c(55.15858, 55.60859, 55.59062), longitude = c(-2.4252843, -2.0650542, -2.0650243))
A has 400 rows and B has 1800 rows.
For every row in A, I would like to find the three closest points in B and the distance in kilometers to each of them, along with the reference and the latitude/longitude coordinates of those points in B.
I tried following this post. However, even when I follow all the instructions, mainly using the function distm from the geosphere package, the distance comes out in a unit that can't possibly be kilometers. I don't see what to change in the code, especially since I am not at all familiar with the geo packages.
Here is a solution using a single loop and a vectorized distance calculation. geosphere returns distances in meters, so the result is divided by 1000 to get km.
The code uses base R's rank function to order the calculated distances.
The indexes (row numbers in B) and the distances of the 3 closest points are stored back in data frame A.
library(geosphere)
A = data.frame(longitude = c(-2.3954998, -2.0650243, -2.0650542), latitude = c(55.32043, 55.59062, 55.60859))
B = data.frame(longitude = c(-2.4252843, -2.0650542, -2.0650243), latitude = c(55.15858, 55.60859, 55.59062))
for (i in seq_len(nrow(A))) {
  # calculate the distance (in km) from point i of A to every point in B;
  # only the coordinate columns are passed, longitude first
  distances <- geosphere::distGeo(A[i, c("longitude", "latitude")], B) / 1000
  # rank the calculated distances
  ranking <- rank(distances, ties.method = "first")
  # find the 3 shortest and store the indexes of B back in A
  A$shortest[i] <- which(ranking == 1)  # same as which.min()
  A$shorter[i] <- which(ranking == 2)
  A$short[i] <- which(ranking == 3)
  # store the distances back in A
  A$shortestD[i] <- distances[A$shortest[i]]  # same as min()
  A$shorterD[i] <- distances[A$shorter[i]]
  A$shortD[i] <- distances[A$short[i]]
}
A
  longitude latitude shortest shorter short shortestD  shorterD   shortD
1 -2.395500 55.32043        1       3     2  18.11777 36.633310 38.28952
2 -2.065024 55.59062        3       2     1   0.00000  2.000682 53.24607
3 -2.065054 55.60859        2       3     1   0.00000  2.000682 55.05710
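The loop stores only row indexes of B. If B also carries a reference column, as in the question, the reference and coordinates of each neighbour can be looked up from those indexes. Below is a rough sketch of one way to do that, using distm to build the full A-by-B distance matrix in one call instead of looping. The reference labels (A1, B1, ...), the variable k and the column names of result are made up for illustration and are not from the original post.

```r
library(geosphere)

# Toy data in the question's layout; the reference labels are hypothetical
A <- data.frame(reference = c("A1", "A2", "A3"),
                latitude  = c(55.32043, 55.59062, 55.60859),
                longitude = c(-2.3954998, -2.0650243, -2.0650542))
B <- data.frame(reference = c("B1", "B2", "B3"),
                latitude  = c(55.15858, 55.60859, 55.59062),
                longitude = c(-2.4252843, -2.0650542, -2.0650243))

k <- 3  # number of nearest neighbours to keep (assumed, as in the question)

# distm() returns metres; rows correspond to A, columns to B.
# Columns are selected by name so that longitude comes first.
d_km <- distm(A[, c("longitude", "latitude")],
              B[, c("longitude", "latitude")],
              fun = distGeo) / 1000

# For each row of A: indexes of the k closest rows of B, nearest first.
# apply() returns a k x nrow(A) matrix (one column per point in A).
nearest <- apply(d_km, 1, function(x) order(x)[seq_len(k)])

# Long-format result: one row per (point in A, neighbour rank)
a_idx <- rep(seq_len(nrow(A)), each = k)
b_idx <- as.vector(nearest)
result <- data.frame(
  A_reference = A$reference[a_idx],
  rank        = rep(seq_len(k), times = nrow(A)),
  B_reference = B$reference[b_idx],
  B_latitude  = B$latitude[b_idx],
  B_longitude = B$longitude[b_idx],
  distance_km = d_km[cbind(a_idx, b_idx)]
)
result
```

With 400 rows in A and 1800 in B the distance matrix has 720,000 entries, which should fit comfortably in memory. For much larger inputs the row-by-row loop above is probably the safer choice.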
As M Viking pointed out, for the geosphere package the data must be arranged Lon then Lat.
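To illustrate the ordering point: the question's data frames store latitude before longitude, and for UK coordinates both orderings fall inside the valid ranges, so geosphere will not raise an error if they are swapped. A minimal sketch, assuming columns named latitude and longitude as in the question (the names p, q and lonlat are only for illustration):

```r
library(geosphere)

# Two points stored latitude-first, as in the question's data frames
p <- data.frame(latitude = 55.32043, longitude = -2.3954998)
q <- data.frame(latitude = 55.15858, longitude = -2.4252843)

# Select the columns by name so longitude always comes first
lonlat <- c("longitude", "latitude")
d_km <- distGeo(p[, lonlat], q[, lonlat]) / 1000

# Passing the data frames as stored treats latitude as longitude;
# the call still runs but returns a different distance
d_wrong_km <- distGeo(p, q) / 1000

c(correct = d_km, swapped = d_wrong_km)
```

Selecting by name rather than by position keeps the call correct even if the column order of the data frame changes later.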