
Use R to identify which counties are within larger congressional districts using sf


I have one sf object with US counties and another with US congressional districts. I need to know (i) which congressional district each county is in, and (ii) when a county straddles a district boundary (i.e. lies partly in two or more districts), which district contains more of it, or what proportion of the county lies in each.

Here are my exact sf objects:

library(USAboundaries)
library(sf)

union_states <- c("Maine", "New Hampshire", "Vermont", "New York", "Massachusetts", "Rhode Island", "Connecticut", "Pennsylvania", "New Jersey", "Ohio", "Indiana", "Illinois", "Iowa", "Wisconsin", "Minnesota", "Michigan") # only core states: excludes CA, WA, KS, and boundary states
union_sf <- us_counties(map_date = "1865-01-01", states = union_states, resolution = 'high')
union_congress_sf <- us_congressional(resolution = "low", states = union_states)

This question asks exactly what I need, but it is a bit outdated and doesn't work with sf objects: Using R intersections to create a polygons-inside-a-polygon key using two shapefile layers


Solution

  • You can use st_join() with the largest = TRUE argument, which gives each county the attributes of the district it overlaps most by area.

    I dropped most of the columns from the counties and congressional districts objects so the result is easier to read. In the output, name is the county and geoid is the congressional district.

    library(dplyr)
    
    union_sf <- union_sf %>% 
      select(name, geometry)
      
    union_congress_sf <- union_congress_sf %>% 
      select(geoid, state_name, geometry)
    
    join <- st_join(union_sf,
                    union_congress_sf,
                    largest = TRUE)
    
    join
    
    Simple feature collection with 804 features and 3 fields
    geometry type:  MULTIPOLYGON
    dimension:      XY
    bbox:           xmin: -97.23421 ymin: 36.97353 xmax: -66.94993 ymax: 49.38437
    geographic CRS: WGS 84
    First 10 features:
               name geoid  state_name                       geometry
    2455  FAIRFIELD  0904 Connecticut MULTIPOLYGON (((-73.5055 41...
    2481   HARTFORD  0901 Connecticut MULTIPOLYGON (((-72.81354 4...
    2495 LITCHFIELD  0905 Connecticut MULTIPOLYGON (((-73.00875 4...
    2497  MIDDLESEX  0902 Connecticut MULTIPOLYGON (((-72.52454 4...
    2541  NEW HAVEN  0903 Connecticut MULTIPOLYGON (((-72.93861 4...
    2554 NEW LONDON  0902 Connecticut MULTIPOLYGON (((-72.33685 4...
    2560    TOLLAND  0902 Connecticut MULTIPOLYGON (((-72.10217 4...
    2569    WINDHAM  0902 Connecticut MULTIPOLYGON (((-71.79924 4...
    4443      ADAIR  1903        Iowa MULTIPOLYGON (((-94.24152 4...
    4444      ADAMS  1903        Iowa MULTIPOLYGON (((-94.47062 4...
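
The join above only reports the district with the largest overlap. For the second part of the question (what proportion of a county lies in each district), one possible approach is to intersect the two layers and divide each piece's area by the area of its county. The sketch below uses made-up toy squares rather than the real data: the square() helper, the districts and counties objects, and the county_area and share columns are all hypothetical names introduced for illustration. It assumes planar coordinates; with the real WGS 84 layers you may prefer to project first with st_transform().

```r
library(sf)

# Hypothetical helper: build a square polygon from its corner coordinates.
square <- function(xmin, ymin, xmax, ymax) {
  st_polygon(list(rbind(c(xmin, ymin), c(xmax, ymin), c(xmax, ymax),
                        c(xmin, ymax), c(xmin, ymin))))
}

# Toy data (not the real layers): two adjacent "districts" and one "county"
# that straddles their shared boundary at x = 2.
districts <- st_sf(
  geoid = c("D1", "D2"),
  geometry = st_sfc(square(0, 0, 2, 2), square(2, 0, 4, 2))
)
counties <- st_sf(
  name = "A",
  geometry = st_sfc(square(1, 0, 2.5, 1))
)

# Store each county's total area before cutting it into pieces.
counties$county_area <- as.numeric(st_area(counties))

# One row per county/district overlap, carrying the attributes of both layers.
# (sf may warn that attributes are assumed spatially constant; that is expected.)
pieces <- st_intersection(counties, districts)

# Share of the county that falls inside each district.
pieces$share <- as.numeric(st_area(pieces)) / pieces$county_area

# Plain table, largest share first within each county.
shares <- st_drop_geometry(pieces)[, c("name", "geoid", "share")]
shares <- shares[order(shares$name, -shares$share), ]
shares
```

With the real data, the same steps would presumably apply with union_sf in place of counties and union_congress_sf in place of districts. Counties with the same name in different states would need a unique identifier column to keep their rows apart.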