Search code examples
rr-sf

Check whether point coordinate lies within polygon


I'm new to spatial data and need some help. I have a set of 1million points and would like to identify which of those are located within one of 5 polygons. Once identified, I would like to delete the entire row of that point.

The polygons are found within a kml file and the points in a csv. After reading the data, I did the following:

library(sf)
library(dplyr)

#create sf of lat long
heat_df$point <- heat_df[,5:6] %>% 
  as.data.frame %>% 
  st_as_sf(coords = c(1,2)) %>%
  st_set_crs(4326)

#make planar
heat_df$point <- st_transform(heat_df$point, 2163)
kml$geometry <- st_transform(kml$geometry, 2163)

#iterate over the following 5 times (once per polygon)
heat_df$inter <- st_intersects(heat_df$point, kml$geometry[1], sparse = FALSE)
heat_df <-  heat_df[!(heat_df$inter == TRUE),]

However, I couldn't find any points in those polygons, even though I know that there are points in those polygons. I checked the data frames and noticed that the coordinates are apparently formatted differently:

> print(heat_df$point[1])
[[1]]
[1] 6407800 9211903
attr(,"class")
[1] "XY"    "POINT" "sfg"  

> print(kml$geometry[1])
[[1]]
[[1]]
         [,1]    [,2] [,3]
 [1,] 4520903 5043254    0
 [2,] 4520945 5043244    0
 [3,] 4521016 5043207    0
 [4,] 4521029 5043312    0
 [5,] 4521027 5043325    0
 [6,] 4521016 5043341    0
 [7,] 4520962 5043405    0
 [8,] 4520926 5043388    0
 [9,] 4520903 5043254    0

attr(,"class")
[1] "XYZ"     "POLYGON" "sfg"    

At least the coordinates in heat_df$point[1] are very different from the ones in kml$geometry[1]. All points and all polygons are located within 1km of each other. So I don't expect point coordinates like [1] 6407800 9211903 and polygon coordinates like [1,] 4520903 5043254 0. But maybe I'm wrong. I haven't worked with spatial data before. Can you help me figure out what's wrong? I want to stick to the sf package, if possible.


Solution

  • The class of problems you describe is called point in polygon.

    You can handle it via sf::st_join() - it will add polygon columns to your point dataset. It uses left join as default = preserves rows of left hand object. It is therefore the best to start with your points object, and add to that characteristics of your polygons when spatially aligned (and NAs when not).

    Two things to keep in mind:

    • the CRS system has to be aligned (which one you choose matters rarely; it would take an extreme edge case - but it has to be aligned; use st_transform for that)
    • depending on the class of your points object (sfc vs. sf) you may need to call st_as_sf() first

    Consider something along these lines:

    pip <- points %>%
      st_join(heat_df)