Search code examples
rshapefiler-sfr-sp

sf package read shapefile for st_within


I'm trying to use st_within to find points within a polygon, but I can't get a shapefile into the right format. I'm using the standard shp file JPN_adm1.shp from https://purl.stanford.edu/np872wp5062 or https://gadm.org/download_country.html

JPN <- st_read("JPN_adm1.shp")
JPN <- JPN[1,] ## select one prefecture, Aichi
JPN <- st_transform(JPN, crs="WGS84")

tokyo.df <- data.frame("name"="Tokyo","lat"=35.507,"lon"=139.208)
some_point.sf <- st_as_sf(tokyo.df, coords = c("lon","lat"), crs="WGS84")
some_point.sf <- st_transform(some_point.sf, crs="WGS84")
subset <- st_join(some_point.sf, JPN, join = st_within)

The code runs, but it doesn't exclude Tokyo which isn't in Aichi. What's wrong?


Solution

  • First, you do not need to use the lines of code starting with st_transform() because the JPN and some_point.sf objects are already in WGS84

    Secondly, not sure to fully understand your problem, but I give it a try! I guess you just need to set the argument left = FALSE inside the st_join() function. Otherwise st_join() performs a left join by default (instead of a inner join which is required in your specific case).

    Please find below a reprex:

    Reprex

    • Code
    library(sf)
    
    JPN <- st_read("JPN_adm1.shp")
    JPN <- JPN[1,] ## select one prefecture, Aichi
    # JPN <- st_transform(JPN, crs="WGS84")            # NO NEED TO DO THAT
    
    tokyo.df <- data.frame("name"="Tokyo","lat"=35.507,"lon"=139.208)
    some_point.sf <- st_as_sf(tokyo.df, coords = c("lon","lat"), crs="WGS84")
    # some_point.sf <- st_transform(some_point.sf, crs="WGS84")       # NO NEED TO DO THAT
    subset <- st_join(some_point.sf, JPN, join = st_within, left = FALSE) # ADD ARGUMENT "LEFT = FALSE"
    
    • Output

    You get an empty sf object as expected:

    subset
    #> Simple feature collection with 0 features and 13 fields
    #> Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
    #> Geodetic CRS:  WGS 84
    #>  [1] name      ID_0      ISO       NAME_0    ID_1      NAME_1    HASC_1   
    #>  [8] CCN_1     CCA_1     TYPE_1    ENGTYPE_1 NL_NAME_1 VARNAME_1 geometry 
    #> <0 lignes> (ou 'row.names' de longueur nulle)
    

    Created on 2022-02-04 by the reprex package (v2.0.1)