
How to measure distance between spatial points by groups in sf R package


I have a dataset of animal records from different countries spanning 13 years, with the x/y coordinates (in meters) and the year of each sighting. I want to measure the distance in meters between each point and its three closest neighbors, but only among points from the same survey year. I then want to find which pairs of locations are closer than 1000 m to each other and export these to a csv.

I am using the sf package and was able to compute the distances between points, but not to subset by year. I found that the 'spatstat' package could do this with 'nndist', but I found it complicated to set up the required window and to understand the point pattern objects it works with. I am a newbie in R and find it hard to work with different types of objects and different packages, so I wonder if there is an easy way to do this in sf. I am open to suggestions on how to do this most efficiently, though: if you suggest another package or approach, please include code for transforming my data into the type of object that package needs.

Thank you!

library(sf)
library(readr)  # provides write_csv

trial <- read.table(text =
"Country    station_code    lat_laea    lon_laea    year
Belize  BF09-1  -2955950    1247610 2009
Belize  BF09-10 -2953600    1248590 2009
Belize  BF09-11 -2954620    1247900 2009
Belize  BF11-13 -2958360    1244020 2011
Belize  BF11-18 -2963740    1240290 2011
Belize  BF11-19 -2963380    1242020 2011
Costa   BraulioCarrilloNP-C16   -3640760    1821170 2011
Costa   BraulioCarrilloNP-C17   -3640730    1823240 2011
Costa   BraulioCarrilloNP-C18   -3642140    1817560 2011
Guatemala   40063   -3178260    1249780 2009
Guatemala   40596   -3183800    1246940 2009
Guatemala   43279   -3182640    1251560 2009", 
                   header = TRUE)

trial.sp <- st_as_sf(trial, coords = c("lat_laea", "lon_laea"), crs = 3035)
plot(st_geometry(trial.sp))

test <- st_distance(trial.sp, trial.sp, by_element = FALSE, which = "Euclidean")
class(test)

test1 <- as.data.frame(test)
write_csv(test1, "test.csv")

What I would like is a csv file listing all pairs of locations from the same survey year that are closer than 1000 m to each other.


Solution

  • The sf package integrates easily with the dplyr framework. The code below draws a circle with a radius of 1000 m around every point in trial.sp and joins these buffers back to the original points using st_within as the join predicate, then keeps only the matches from the same year.

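    library(dplyr)

    # Match each point to every 1000 m buffer it falls inside, then keep same-year matches
    trial.sp %>%
      st_join(trial.sp %>% st_buffer(dist = 1000), join = st_within) %>%
      filter(year.x == year.y)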

    The result still includes stations that have no other station within 1000 m, because every point lies inside its own buffer and is therefore joined to itself. To remove these self-matches, use: filter(year.x == year.y & station_code.x != station_code.y)
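Since the question asks for a csv of the close pairs, here is a minimal sketch of one possible way to get there. It is a different route from the buffer join above: it computes the distance matrix separately for each year and lists each pair once together with its distance. The sample data, the helper name pairs_within and the output file name close_pairs.csv are made up for illustration, and the sketch assumes projected coordinates in meters.

```r
library(sf)

# Made-up sample: hypothetical stations with projected coordinates in meters
pts <- data.frame(
  station_code = c("A1", "A2", "A3", "B1", "B2"),
  x    = c(0, 600, 5000, 0, 300),
  y    = c(0,   0,    0, 0, 400),
  year = c(2009, 2009, 2009, 2011, 2011)
)
pts.sf <- st_as_sf(pts, coords = c("x", "y"), crs = 3035)

# Hypothetical helper: all pairs in one group closer than max_dist, each pair listed once
pairs_within <- function(g, max_dist = 1000) {
  d <- st_distance(g)                          # pairwise distances (units matrix)
  d <- matrix(as.numeric(d), nrow = nrow(g))   # drop the units, keep the matrix shape
  # the upper triangle excludes self-matches and mirrored duplicates (A-B vs B-A)
  idx <- which(d < max_dist & upper.tri(d), arr.ind = TRUE)
  data.frame(
    year       = g$year[idx[, "row"]],
    station_a  = g$station_code[idx[, "row"]],
    station_b  = g$station_code[idx[, "col"]],
    distance_m = d[idx]
  )
}

# Split by year so distances are only computed between same-year points
close_pairs <- do.call(rbind, lapply(split(pts.sf, pts.sf$year), pairs_within))

# Placeholder file name; row.names = FALSE keeps the csv to the four columns
write.csv(close_pairs, "close_pairs.csv", row.names = FALSE)
```

To try this on the data from the question, pts.sf would be replaced with trial.sp. Stations with no neighbor within the threshold simply do not appear in the output.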