I have two dataframes, both with latitude and longitude columns, and I want to compare the coordinates of one against the other. For example, Market1 in dataframe x has latitude -22.89290 and longitude -48.45048, which is similar (not equal) to the third row of dataframe y, which has latitude -22.89287 and longitude -48.45075. The same happens for the other markets. Note that the first 5 digits are the same, in this case -22.892 for lat and -48.450 for long. Where the coordinates match in this way, I want to insert the names from the Marketname column of dataframe x into the Marketname column of dataframe y, as shown in the expected output.
x<-structure(list(Latitude = c(-22.8928950233225, -22.929618895716,
-22.8773751675936), Longitude = c(-48.4504779883472, -48.4515412645053,
-48.4364903609011), Marketname = c("Market1", "Market2",
"Market3")), row.names = c(NA, 3L), class = "data.frame")
> x
   Latitude Longitude Marketname
1 -22.89290 -48.45048    Market1
2 -22.92962 -48.45154    Market2
3 -22.87738 -48.43649    Market3
y<- structure(list(lat = c(-22.8715263, -22.8774825, -22.8928723,
-22.9295906), lng = c(-48.440335, -48.4364964, -48.4507477, -48.4516264
), Marketname = c("Market0","0", "0", "0")), row.names = c(NA, 4L), class = "data.frame")
> y
        lat       lng Marketname
1 -22.87153 -48.44033    Market0
2 -22.87748 -48.43650          0
3 -22.89287 -48.45075          0
4 -22.92959 -48.45163          0
Expected output
> y
        lat       lng Marketname
1 -22.87153 -48.44033    Market0
2 -22.87748 -48.43650    Market3
3 -22.89287 -48.45075    Market1
4 -22.92959 -48.45163    Market2
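The "same leading digits" comparison described above can be sketched in base R as follows. This is only an illustration of the idea, assuming that "first 5 digits" means truncating both coordinates to three decimal places; the helper `key` and the result name `y_trunc` are hypothetical names introduced here. Truncation is fragile for points that lie close to a boundary between two truncated values, so two nearby points can still end up with different keys.

```r
x <- data.frame(
  Latitude   = c(-22.8928950233225, -22.929618895716, -22.8773751675936),
  Longitude  = c(-48.4504779883472, -48.4515412645053, -48.4364903609011),
  Marketname = c("Market1", "Market2", "Market3"),
  stringsAsFactors = FALSE
)
y <- data.frame(
  lat = c(-22.8715263, -22.8774825, -22.8928723, -22.9295906),
  lng = c(-48.440335, -48.4364964, -48.4507477, -48.4516264),
  Marketname = c("Market0", "0", "0", "0"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one text key per point, built from both
# coordinates truncated to three decimal places (an assumption)
key <- function(lat, lng) paste(trunc(lat * 1000), trunc(lng * 1000))

# For each row of y, the row of x with the same key (NA if none)
idx <- match(key(y$lat, y$lng), key(x$Latitude, x$Longitude))

# Copy the market name only where a matching key was found
y_trunc <- y
hit <- !is.na(idx)
y_trunc$Marketname[hit] <- x$Marketname[idx[hit]]
y_trunc
```

On the sample data, rows 2 to 4 of y find a match and row 1 keeps its existing name.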
Since you seem to need the shortest distance to each market based on geocoordinates on an ellipsoid, you could use something like geosphere::distGeo. To compare each coordinate of y with each of x, we can Vectorize the function and use an outer approach. Note that we don't want the first row of y, which already has a market name, so we remove the first element of the row sequence [1,...n] with seq_len(nrow(y))[-1].
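# Distance between row i of y and row j of x; distGeo expects longitude first, hence 2:1
f <- Vectorize(\(i, j) geosphere::distGeo(y[i, 2:1], x[j, 2:1]))
# For every row of y except the first, the index of the nearest row of x
u <- outer(seq_len(nrow(y))[-1], seq_len(nrow(x)), f) |> apply(1, which.min)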
# [1] 3 1 2
Insert into y like so:
y[-1, 3] <- x[u, 3]
#         lat       lng Marketname
# 1 -22.87153 -48.44033    Market0
# 2 -22.87748 -48.43650    Market3
# 3 -22.89287 -48.45075    Market1
# 4 -22.92959 -48.45163    Market2
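If you would rather not skip the first row by hand, one possible variant is to compute the nearest market for every row of y and only overwrite the name when that market is close enough. The sketch below is a base R illustration under stated assumptions, not the geosphere method: it swaps in a spherical haversine distance (the hypothetical helper `haversine`, with an assumed mean Earth radius of 6371 km) for the ellipsoidal distGeo, and the 100 m cutoff `max_dist` is an arbitrary value chosen for this sample data.

```r
x <- data.frame(
  Latitude   = c(-22.8928950233225, -22.929618895716, -22.8773751675936),
  Longitude  = c(-48.4504779883472, -48.4515412645053, -48.4364903609011),
  Marketname = c("Market1", "Market2", "Market3"),
  stringsAsFactors = FALSE
)
y <- data.frame(
  lat = c(-22.8715263, -22.8774825, -22.8928723, -22.9295906),
  lng = c(-48.440335, -48.4364964, -48.4507477, -48.4516264),
  Marketname = c("Market0", "0", "0", "0"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: great-circle distance in metres on a sphere
# (an approximation, not the ellipsoidal distance distGeo returns)
haversine <- function(lat1, lon1, lat2, lon2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Distance matrix: rows correspond to y, columns to x
d <- outer(seq_len(nrow(y)), seq_len(nrow(x)), function(i, j) {
  haversine(y$lat[i], y$lng[i], x$Latitude[j], x$Longitude[j])
})

nearest  <- apply(d, 1, which.min)               # closest row of x per row of y
min_dist <- d[cbind(seq_len(nrow(y)), nearest)]  # distance to that row

max_dist <- 100  # metres; assumed cutoff, tune for your data
ok <- min_dist <= max_dist

# Overwrite the name only where the nearest market is within the cutoff
y2 <- y
y2$Marketname[ok] <- x$Marketname[nearest[ok]]
y2
```

With this sample data the first row of y is several hundred metres from its nearest market, so it keeps Market0, while the other three rows are each within a few tens of metres of one market.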