Search code examples
rggplot2mapsvisualization

Cloropleth using ggplot


I've been trying to plot a cloropleth map. I've made some progress, but I'm stuck now. The idea is to plot the map of Brazil and color code each state bases on the number of violent intentional deaths in 2021.

I'm using the following dataset:

State <- c("SP", "SC", "DF", "MG", "RS", "MS", "PR", "AC", "PI","TO", "MT", "RO", "GO", "RJ", "ES", "MA", "PB", "AL", "RN", "PA", "SE", "PE", "RR", "CE", "AM", "BA", "AP")
VID <- c(7.9, 10.1, 11.2, 11.4, 15.9, 20.7, 20.8, 21.2, 23.8, 24.3, 24.9, 25.0, 26.1, 27.2, 28.2, 28.3, 28.6, 31.8, 32.4, 32.8, 33.9, 34.8, 35.5, 37.0, 39.1, 44.9, 53.8)
VIDBrazil <- data.frame(State, VID)

The geoJSON data file from Brazil I'm sourcing from this Brasil JSON file

The code I'm trying to use is based on resources found here, especially here, here, here and here

The code I'm using is the following:

State <- c("SP", "SC", "DF", "MG", "RS", "MS", "PR", "AC", "PI","TO", "MT", "RO", "GO", "RJ", "ES", "MA", "PB", "AL", "RN", "PA", "SE", "PE", "RR", "CE", "AM", "BA", "AP")
VID <- c(7.9, 10.1, 11.2, 11.4, 15.9, 20.7, 20.8, 21.2, 23.8, 24.3, 24.9, 25.0, 26.1, 27.2, 28.2, 28.3, 28.6, 31.8, 32.4, 32.8, 33.9, 34.8, 35.5, 37.0, 39.1, 44.9, 53.8)
VIDBrazil <- data.frame(State, VID)

spdf_Brasil <- geojson_read("C:/Users/Giovanni/Desktop/R/brazil_geo.json", what = "sp") #Import JSON file

spdf_Brasil_A<- tidy(spdf_Brasil) #Convert to data frame

ggplot() + #plot the map
  geom_polygon(data = spdf_Brasil_A, aes( x = long, y = lat, group = group), fill="#69b3a2", color="white") +
  theme_void() +
  coord_map()

spdf_Brasil_MVI <- spdf_Brasil_A %>% #add the values of deaths to use for color coding the map
  left_join(. , VIDBrazil, by=c("id"="State"))


ggplot() + #ploting with color coding
  geom_polygon(data = spdf_Brasil_MVI, aes( x = long, y = lat, group = group, fill= VID), color="green") + 
  theme_void() +
  scale_fill_gradient(name = "Mortes por 100 mil habitantes", low = "#fdc0b2", high = "#9a1f03", limits = c(0, 60.0)) +
  coord_map()

Obtained map

What am I missing? What should I do to get the map color coded?

Thank you for your assistance!


Solution

  • id field in GeoJSON is not handled by broom::tidy() as-is, so there are no State codes in the resulting data frame and join will not work as intended. Though the column named id is still there, thus left_join() will not throw an error. If you check your spdf_Brasil_MVI, you may notice that all VID values are NAs. And instead of State codes, id column includes numeric values stored as characters:

    glimpse(spdf_Brasil_MVI)
    #> Rows: 581,207
    #> Columns: 8
    #> $ long  <dbl> -73.33251, -73.27482, -72.87016, -72.69151, -72.65935, -72.18967, -72.17318, -…
    #> $ lat   <dbl> -7.324879, -7.350334, -7.528882, -7.610621, -7.624979, -7.720489, -7.723841, -…
    #> ...
    #> $ id    <chr> "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1"…
    #> $ VID   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
    

    Using sf package for handling spatial data and ggplot's geom_sf() layer for plotting sf objects simplifies this quite a bit:

    library(sf)
    library(ggplot2)
    library(dplyr)
    VIDBrazil <- data.frame(
      State = c("SP", "SC", "DF", "MG", "RS", "MS", "PR", "AC", "PI","TO", "MT", "RO", "GO", "RJ", "ES", "MA", "PB", "AL", "RN", "PA", "SE", "PE", "RR", "CE", "AM", "BA", "AP"),
      VID = c(7.9, 10.1, 11.2, 11.4, 15.9, 20.7, 20.8, 21.2, 23.8, 24.3, 24.9, 25.0, 26.1, 27.2, 28.2, 28.3, 28.6, 31.8, 32.4, 32.8, 33.9, 34.8, 35.5, 37.0, 39.1, 44.9, 53.8))
    
    sf_Brasil <- st_read("brazil_geo.json", as_tibble = TRUE, quiet = TRUE)
    # sf object, a data.frame / tibble with all features (id, name) + geometry column
    # and projection details:
    sf_Brasil
    #> Simple feature collection with 27 features and 2 fields
    #> Geometry type: MULTIPOLYGON
    #> Dimension:     XY
    #> Bounding box:  xmin: -73.98971 ymin: -33.74708 xmax: -28.84694 ymax: 5.264878
    #> Geodetic CRS:  WGS 84
    #> # A tibble: 27 × 3
    #>    id    name                                                           geometry
    #>    <chr> <chr>                                                <MULTIPOLYGON [°]>
    #>  1 AC    Acre             (((-73.33251 -7.324879, -73.27482 -7.350334, -72.8701…
    #>  2 AL    Alagoas          (((-35.90153 -9.861805, -35.90153 -9.862083, -35.9012…
    #>  3 AP    Amapá            (((-50.02403 0.859862, -50.02403 0.859306, -50.02347 …
    # ...
    
    p1 <- ggplot(sf_Brasil) +
      geom_sf(fill="#69b3a2", color="white") +
      theme_void()
    
    sf_Brasil_MVI <- left_join(sf_Brasil, VIDBrazil, by = join_by(id == State))
    # join result, includes VID values:
    sf_Brasil_MVI
    #> Simple feature collection with 27 features and 3 fields
    #> Geometry type: MULTIPOLYGON
    #> Dimension:     XY
    #> Bounding box:  xmin: -73.98971 ymin: -33.74708 xmax: -28.84694 ymax: 5.264878
    #> Geodetic CRS:  WGS 84
    #> # A tibble: 27 × 4
    #>    id    name                                                     geometry   VID
    #>    <chr> <chr>                                          <MULTIPOLYGON [°]> <dbl>
    #>  1 AC    Acre             (((-73.33251 -7.324879, -73.27482 -7.350334, -7…  21.2
    #>  2 AL    Alagoas          (((-35.90153 -9.861805, -35.90153 -9.862083, -3…  31.8
    #>  3 AP    Amapá            (((-50.02403 0.859862, -50.02403 0.859306, -50.…  53.8
    # ...
    
    p2 <- ggplot(sf_Brasil_MVI) + 
      geom_sf(aes(fill = VID), color="green") +
      scale_fill_gradient(name = "Mortes por 100 mil habitantes", low = "#fdc0b2", high = "#9a1f03", limits = c(0, 60.0)) +
      theme_void() +
      theme(legend.position = "bottom")
    
    p1;p2
    

    For labelling sf objects we could use geom_sf_label() or geom_sf_text(), but when things can get bit crowded and it's not so easy to fit all labels, geom_label_repel() & geom_text_repel() form ggrepel package are worth considering.

    library(ggrepel)
    
    # add layer to existing p2 plot
    # get coordinates for geom_label_repel from sf_coordinates stats
    # name : column in p2 data (sf_Brasil_MVI) 
    # geometry : geometry column of p2 data (sf_Brasil_MVI)
    p2 + geom_label_repel(aes(label = name, geometry = geometry),
                          stat = "sf_coordinates")
    

    Created on 2023-03-12 with reprex v2.0.2