Search code examples
rgeospatialdata-manipulationlookupshapefile

(R) : Lookup Tables Using Geographical Coordinates?


I have the following shapefile in R and created this map of eastern United States.

library(sf)  
library(leaflet)
library(leafgl)
library(colourvalues)
library(leaflet.extras)


nc <- st_read(system.file("gpkg/nc.gpkg", package="sf"), quiet = TRUE) %>% 
  st_transform(st_crs(4326)) %>% 
  st_cast('POLYGON')

The shapefile looks something like this:

> nc
Simple feature collection with 108 features and 14 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -84.32377 ymin: 33.88212 xmax: -75.45662 ymax: 36.58973
Geodetic CRS:  WGS 84
First 10 features:
     AREA PERIMETER CNTY_ CNTY_ID        NAME  FIPS FIPSNO CRESS_ID BIR74 SID74 NWBIR74 BIR79 SID79 NWBIR79                           geom
1   0.114     1.442  1825    1825        Ashe 37009  37009        5  1091     1      10  1364     0      19 POLYGON ((-81.47258 36.2344...
2   0.061     1.231  1827    1827   Alleghany 37005  37005        3   487     0      10   542     3      12 POLYGON ((-81.23971 36.3654...
3   0.143     1.630  1828    1828       Surry 37171  37171       86  3188     5     208  3616     6     260 POLYGON ((-80.45614 36.2426...
4   0.070     2.968  1831    1831   Currituck 37053  37053       27   508     1     123   830     2     145 POLYGON ((-76.00863 36.3196...
4.1 0.070     2.968  1831    1831   Currituck 37053  37053       27   508     1     123   830     2     145 POLYGON ((-76.02682 36.5567...
4.2 0.070     2.968  1831    1831   Currituck 37053  37053       27   508     1     123   830     2     145 POLYGON ((-75.90164 36.5562...
5   0.153     2.206  1832    1832 Northampton 37131  37131       66  1421     9    1066  1606     3    1197 POLYGON ((-77.21736 36.2410...
6   0.097     1.670  1833    1833    Hertford 37091  37091       46  1452     7     954  1838     5    1237 POLYGON ((-76.74474 36.2339...
7   0.062     1.547  1834    1834      Camden 37029  37029       15   286     0     115   350     2     139 POLYGON ((-76.00863 36.3196...
8   0.091     1.284  1835    1835       Gates 37073  37073       37   420     0     254   594     2     371 POLYGON ((-76.56218 36.3406...

And the map looks something like this:

leaflet(data = nc) %>% addPolygons( stroke = FALSE) %>% addTiles(group = "OSM") %>%  addProviderTiles(provider = providers$OpenStreetMap) %>% addPolygons(data = nc, weight=1, popup = ~NAME,
                label = ~NAME, group = "name", col = 'blue') %>% 
    addSearchFeatures(targetGroups  = 'name', options = searchFeaturesOptions(zoom=10, openPopup=TRUE))

enter image description here

One of the locations in this map is "Durham". I know that "Duke University" (https://en.wikipedia.org/wiki/Duke_University) is located in Durham. I know that the geographical coordinates of Duke University is (36.001660130419914, -78.93825005950572).

I have the following question: Suppose I am given the coordinates (36.001660130419914, -78.93825005950572) - can I find out which "NAME" (relative to the shapefile) that this coordinate is located within?

head(nc$NAME)
  [1] "Ashe"         "Alleghany"    "Surry"        "Currituck"    "Currituck"    "Currituck"    "Northampton"  "Hertford"     "Camden" 

As an example - suppose I have the following coordinates:

id = c(1,2,3)
lat = c(36.001660130419914, 35.78488950612783, 35.252238310927275)
long = c(-78.93825005950572,  -78.6820624293097, -80.84523522326482) 
  
my_data = data.frame(id, lat, long)

  id      lat      long
1  1 36.00166 -78.93825
2  2 35.78489 -78.68206
3  3 35.25224 -80.84524

Can I find out which "NAME" each of these points is situated in? For example:

  id      lat      long   name
1  1 36.00166 -78.93825 Durham
2  2 35.78489 -78.68206    etc
3  3 35.25224 -80.84524    etc

Is it somehow possible to create a "lookup table" that works based on geographical coordinates?

Thank you!


Solution

  • You can use st_intersects.

    First you have to make your points sf in the same CRS as the shapefile. If you're ever unsure of the CRS, you can check it:

    st_crs(nc)
    

    This prints out a slew of information, but the first and last item is the CRS.

    This is how I did it... I do want to point out that I needed to reverse the order of x and y to make this work (it's in the code).

    lapply(1:nrow(my_data), 
           function(j) {
             first = st_sfc(st_point(rev(unlist(my_data[j, 2:3]))), crs = 4326)
             nc[which(st_intersects(first, nc, sparse = F)), ]$NAME
           })
    

    Likely, you don't want just to print this to the console, so perhaps you would do something like this.

    my_data$region <- lapply(1:nrow(my_data), 
                             function(j) {
                               first = st_sfc(st_point(rev(unlist(my_data[j, 2:3]))), 
                                              crs = 4326)
                               nc[which(st_intersects(first, nc, sparse = F)), ]$NAME
                             }) %>% unlist()
    
    # returns 
    # [1] "Durham"      "Wake"        "Mecklenburg"