
Putting Values on a County Map in R


I am using an Excel sheet for my data. One column holds the FIPS codes for the Georgia (GA) counties and the other, labeled Count, holds values from 1 to 5. I made a map of these values with the following code:

    library(usmap)
    library(ggplot2)
    library(rio)
    carrierdata <- import("GA Info.xlsx")
    plot_usmap(regions = "counties", include = "GA", data = carrierdata, values = "Count", color = "black") +
        labs(title = "Georgia") +
        scale_fill_continuous(low = "#56B1F7", high = "#132B43", name = "Count", labels = scales::comma) +
        theme(plot.background = element_rect(), legend.position = "right")

I've included a picture of the map I get and a sample of the data I am using. Can anyone help me put the actual Count numbers on each county?
Thanks!
[Images: sample of the data; the resulting GA map]


Solution

  • The usmap package is a good source for county maps, but it stores them as data frames of x, y coordinates tracing each county's outline, whereas you need the numbers plotted at the center of each county. The package doesn't seem to include center coordinates for the counties.

    It's a bit of a pain, but it is worth converting the map to a proper sf data frame: this gives better plotting options and lets us calculate the centroid of each county. First, we load the necessary packages, get the Georgia data and convert it to sf format:

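    library(usmap)
    library(sf)
    library(ggplot2)

    # County outlines as a data frame of points (columns include x, y, fips, abbr, county)
    d   <- us_map("counties")
    d   <- d[d$abbr == "GA",]
    # Build one polygon per county from its outline points
    GAc <- lapply(split(d, d$county), function(x) st_polygon(list(cbind(x$x, x$y))))
    # Collect the polygons into a geometry column carrying usmap's projection
    GA  <- st_sfc(GAc, crs = usmap_crs()@projargs)
    # Attach FIPS codes and county names to give an sf data frame
    GA  <- st_sf(data.frame(fips = unique(d$fips), county = names(GAc), geometry = GA))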
    

    Now, obviously I don't have your numeric data, so I'll have to make some up, equivalent to the data you are importing from Excel. I'll assume your own carrierdata has a column named "fips" and another called "values" (the column in the question is called Count, so substitute that name where needed):

    set.seed(69)
    carrierdata <- data.frame(fips = GA$fips, values = sample(5, nrow(GA), TRUE))
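
    A real sheet will probably not arrive in exactly that shape. Here is a minimal sketch of the clean-up, assuming the imported columns are called fips and Count (as the plot_usmap call in the question suggests) and that Excel delivered the FIPS codes as numbers. The three rows are hypothetical stand-ins for the imported sheet:

    ```r
    # Hypothetical stand-in for import("GA Info.xlsx"); the real values will differ
    carrierdata <- data.frame(fips = c(13001, 13003, 13005), Count = c(2, 5, 1))

    # usmap stores FIPS codes as 5-character strings, so convert the numbers to match
    carrierdata$fips <- sprintf("%05d", as.integer(carrierdata$fips))

    # Rename Count to values so the rest of the code can be used unchanged
    names(carrierdata)[names(carrierdata) == "Count"] <- "values"
    ```

    Matching the column type matters because the join in the next step compares the two fips columns, and a numeric column will not line up with a character one.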
    

    So now we left_join our imported data to the GA county data:

    GA <- dplyr::left_join(GA, carrierdata, by = "fips")
    

    And we can calculate the center point for each county:

    GA$centroids <- st_centroid(GA$geometry)
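
    To see what st_centroid returns, here is a small check on a hypothetical 2 x 2 square rather than a county outline. It assumes only that sf is installed; the names sq and ctr are made up for the illustration:

    ```r
    library(sf)

    # A hypothetical square with corners (0, 0) and (2, 2); the ring must be
    # closed, i.e. the first and last points are identical
    sq <- st_polygon(list(rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0))))
    sq <- st_sfc(sq)

    # st_centroid() returns a point geometry; st_coordinates() extracts its
    # x and y values as a matrix with columns X and Y
    ctr <- st_coordinates(st_centroid(sq))
    ```

    For oddly shaped counties the centroid can fall outside the outline. In that case sf's st_point_on_surface(), which returns a point guaranteed to lie on the polygon, may place the label better.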
    

    All that's left now is to plot the result:

    ggplot(GA) + 
      geom_sf(aes(fill = values)) + 
      geom_sf_text(aes(label = values, geometry = centroids), colour = "white")
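
    If you want to keep the title, legend and colours from the original plot_usmap call, the same layers can be added to this plot. The sketch below uses two hypothetical unit squares in place of the counties so it runs on its own. The helper make_sq and the object toy are made up, and theme_void() is my own addition to hide the axes:

    ```r
    library(sf)
    library(ggplot2)

    # Two hypothetical unit squares standing in for counties
    make_sq <- function(x0) {
      st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0))))
    }
    toy <- st_sf(values = c(2, 5), geometry = st_sfc(make_sq(0), make_sq(1)))
    toy$centroids <- st_centroid(toy$geometry)

    p <- ggplot(toy) +
      geom_sf(aes(fill = values), colour = "black") +
      geom_sf_text(aes(label = values, geometry = centroids), colour = "white") +
      # Same low/high colours and legend title as in the question
      scale_fill_gradient(low = "#56B1F7", high = "#132B43", name = "Count") +
      labs(title = "Georgia") +
      theme_void() +                      # my addition: hide axes and grid lines
      theme(legend.position = "right")
    ```

    Printing p draws the map. With the real data, replace toy with GA.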
    
