Search code examples
rggplot2usmap

Plotting old Connecticut county regions in USMAP


Recently, the American Census Bureau changed the geographical entities reported in the state of Connecticut, as discussed here. Previously, data was reported in the following 8 counties:

CT_counties <- data.frame(county = c("Fairfield County, CT",
                                 "Hartford County, CT",
                                 "Litchfield County, CT",
                                 "Middlesex County, CT",
                                 "New Haven County, CT",
                                 "New London County, CT",
                                 "Tolland County, CT",
                                 "Windham County, CT"),
                      fips = c("09001",
                               "09003",
                               "09005",
                               "09007",
                               "09009",
                               "09011",
                               "09013",
                               "09015"))

As usmap has been updated to facilitate this change, the new 9 planning regions used are:

CT_regions <- data.frame(region = c("Capitol Planning Region",
                             "Greater Bridgeport Planning Region",
                             "Lower Connecticut River Valley Planning Region",
                             "Naugatuck Valley Planning Region",
                             "Northeastern Connecticut Planning Region",
                             "Northwest Hills Planning Region",
                             "South Central Connecticut Planning Region",
                             "Southeastern Connecticut Planning Region",
                             "Western Connecticut Planning Region"),
                  fips = c("09110",
                           "09120",
                           "09130",
                           "09140",
                           "09150",
                           "09160",
                           "09170",
                           "09180",
                           "09190"))

Because usmap uses the new shapefiles, the following code works:

library(usmap)
plotdf <- data.frame(fips = CT_regions$fips, vals = rnorm(9))
plot_usmap(data = plotdf, values = "vals",
       include = c("MA", "CT", "RI"),
       labels = FALSE)

Producing the following image: enter image description here

But, since a lot of data is reported at the older county level, it is useful to still be able to plot those. However, this no longer works. The following code gives nothing useful:

library(usmap)
plotdf <- data.frame(fips = CT_counties$fips, vals = rnorm(8))
plot_usmap(data = plotdf, values = "vals",
       include = c("MA", "CT", "RI"),
       labels = FALSE)

Question: Using usmap, is there any easy way to plot the counties previously used for Connecticut (CT_counties) ? Particularly a way that does not involve going back to previous versions of usmap. As it is useful to be able to do both things in the same script.


Solution

  • To elaborate on my comment, the workflow for creating choropleth maps for 2021 and years prior to Connecticut's shift to planning regions would look something like this:

    First, we grab the TIGER shapeline files using tigris::counties function. Note that since we are going directly to the TIGER files, we won't use usmap("counties") in the workflow since it uses FIPS codes from 2022.

    library(tigris)
    library(dplyr)
    library(ggplot2)
    
    options(tigris_use_cache = TRUE)
    
    CT_county_shp <- counties(state = "CT", year = 2021)
    
    CT_counties <- data.frame(
      county = c(
        "Fairfield County, CT",
        "Hartford County, CT",
        "Litchfield County, CT",
        "Middlesex County, CT",
        "New Haven County, CT",
        "New London County, CT",
        "Tolland County, CT",
        "Windham County, CT"
      ),
      fips = c(
        "09001",
        "09003",
        "09005",
        "09007",
        "09009",
        "09011",
        "09013",
        "09015"
      ),
    # This can be any variable(s) as long as we have the correct FIPS codes
      vals = rnorm(8)
    )
    

    Next, we join our counties data frame to our counties shapefile that we downloaded from the US Census Bureau (I use tidyverse tools here, but base R or data.table works as well):

    CT_counties_clean <- left_join(
      CT_county_shp,
      CT_counties,
      by = join_by("GEOID"=="fips"),
      keep = TRUE
    ) |>
      select(
        county,
        fips,
        vals,
        geometry
      )
    

    Our CT_counties_clean data frame should also be a simple features, or sf, object (e.g., class(CT_counties_clean) returns both data.frame and sf). The geometry column is a list column containing multipolygon geometries, which is identical to the geom column of usmap("counties"). Since plot_usmap() uses ggplot() under the hood, but doesn't allow us to join our data to anything other than the 2022 counties, we can just construct our own ggplot() instead, such as:

    ggplot() +
      geom_sf(data = CT_counties_clean, aes(fill = vals)) +
      labs(title = "CT Counties (2021)") +
      theme_void()
    

    Choropleth of arbitrary values for Connecticut counties 2021

    From there, you can modify the map to your liking to add labels, custom colors, fonts, etc., as well as playing around with the map projections. usmap_crs() indicates that the default CRS for usmap objects is EPSG 9311 coordinate reference system, whereas fetching CT counties from tigris and running sf::st_crs() shows that the TIGER files are in the EPSG 4269 coordinate reference system. You'll want to ensure that you are using the same CRS for your 2021 and 2022 CT county maps. Hope this helps you along!