Let's say I have a regular latitude/longitude grid and data at irregular locations, like this:
grid = tidyr::crossing(lon = seq(0, 1, 0.25), lat = seq(0, 1, 0.25))
data = tibble::tibble(lon = runif(4), lat=runif(4), y=rnorm(4))
How do I use, for example, dplyr::inner_join and join_by to join these data frames so that I get y values from data and the corresponding lat and lon values from grid at the nearest location, i.e. the grid point with the smallest (grid$lon - data$lon)^2 + (grid$lat - data$lat)^2 for each row in data?
The sf package is designed for manipulating spatial geometries, e.g. points, lines, and polygons. Convert the data frames to sf objects, then perform a spatial join with st_join(), passing join = st_nearest_feature as an argument.
library(sf)
library(tidyverse)
set.seed(42)
grid <- tidyr::crossing(lon = seq(0, 1, 0.25), lat = seq(0, 1, 0.25))
data <- tibble::tibble(lon = runif(4), lat = runif(4), y = rnorm(4))
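# Convert both data frames to sf point objects
grid_sf <- st_as_sf(grid, coords = c("lon", "lat"))
data_sf <- st_as_sf(data, coords = c("lon", "lat"))
# Attach to each grid point the y value of its nearest data point
joined <- st_join(grid_sf, data_sf, join = st_nearest_feature)
# Plot grid points and observations (squares), both coloured by y
ggplot() +
  geom_sf(data = joined, aes(col = y)) +
  geom_sf(data = data_sf, aes(col = y, fill = y), size = 4, shape = 22)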
Created on 2024-07-12 with reprex v2.1.0
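Note that st_join(grid_sf, data_sf, ...) keeps one row per grid point. If you instead want one row per observation in data, with lon and lat taken from the nearest grid point as the question describes, one option is to call st_nearest_feature() directly and use the returned row indices to look up the grid coordinates. The following is a minimal sketch under two assumptions: no CRS is set, so sf measures planar distance on the raw lon/lat values (which matches the squared-difference criterion in the question), and the names nearest_idx and nearest are only illustrative.

```r
library(sf)

set.seed(42)
grid <- tidyr::crossing(lon = seq(0, 1, 0.25), lat = seq(0, 1, 0.25))
data <- tibble::tibble(lon = runif(4), lat = runif(4), y = rnorm(4))

# Convert to sf points; row order is preserved, so indices into grid_sf
# are also valid indices into grid
grid_sf <- st_as_sf(grid, coords = c("lon", "lat"))
data_sf <- st_as_sf(data, coords = c("lon", "lat"))

# For each row of data_sf, the row index of the nearest point in grid_sf
nearest_idx <- st_nearest_feature(data_sf, grid_sf)

# One row per observation: y from data, lon/lat from the nearest grid point
nearest <- tibble::tibble(
  y   = data$y,
  lon = grid$lon[nearest_idx],
  lat = grid$lat[nearest_idx]
)
nearest
```

If the coordinates are true geographic longitudes and latitudes and you set a CRS such as 4326, sf would be expected to use geodesic rather than planar distance, so the nearest grid point may differ from the one given by the formula in the question.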