I've been trying to plot a cloropleth map. I've made some progress, but I'm stuck now. The idea is to plot the map of Brazil and color code each state bases on the number of violent intentional deaths in 2021.
I'm using the following dataset:
State <- c("SP", "SC", "DF", "MG", "RS", "MS", "PR", "AC", "PI","TO", "MT", "RO", "GO", "RJ", "ES", "MA", "PB", "AL", "RN", "PA", "SE", "PE", "RR", "CE", "AM", "BA", "AP")
VID <- c(7.9, 10.1, 11.2, 11.4, 15.9, 20.7, 20.8, 21.2, 23.8, 24.3, 24.9, 25.0, 26.1, 27.2, 28.2, 28.3, 28.6, 31.8, 32.4, 32.8, 33.9, 34.8, 35.5, 37.0, 39.1, 44.9, 53.8)
VIDBrazil <- data.frame(State, VID)
The geoJSON data file from Brazil I'm sourcing from this Brasil JSON file
The code I'm trying to use is based on resources found here, especially here, here, here and here
The code I'm using is the following:
State <- c("SP", "SC", "DF", "MG", "RS", "MS", "PR", "AC", "PI","TO", "MT", "RO", "GO", "RJ", "ES", "MA", "PB", "AL", "RN", "PA", "SE", "PE", "RR", "CE", "AM", "BA", "AP")
VID <- c(7.9, 10.1, 11.2, 11.4, 15.9, 20.7, 20.8, 21.2, 23.8, 24.3, 24.9, 25.0, 26.1, 27.2, 28.2, 28.3, 28.6, 31.8, 32.4, 32.8, 33.9, 34.8, 35.5, 37.0, 39.1, 44.9, 53.8)
VIDBrazil <- data.frame(State, VID)
spdf_Brasil <- geojson_read("C:/Users/Giovanni/Desktop/R/brazil_geo.json", what = "sp") #Import JSON file
spdf_Brasil_A<- tidy(spdf_Brasil) #Convert to data frame
ggplot() + #plot the map
geom_polygon(data = spdf_Brasil_A, aes( x = long, y = lat, group = group), fill="#69b3a2", color="white") +
theme_void() +
coord_map()
spdf_Brasil_MVI <- spdf_Brasil_A %>% #add the values of deaths to use for color coding the map
left_join(. , VIDBrazil, by=c("id"="State"))
ggplot() + #ploting with color coding
geom_polygon(data = spdf_Brasil_MVI, aes( x = long, y = lat, group = group, fill= VID), color="green") +
theme_void() +
scale_fill_gradient(name = "Mortes por 100 mil habitantes", low = "#fdc0b2", high = "#9a1f03", limits = c(0, 60.0)) +
coord_map()
What am I missing? What should I do to get the map color coded?
Thank you for your assistance!
id
field in GeoJSON is not handled by broom::tidy()
as-is, so there are no State codes in the resulting data frame and join will not work as intended. Though the column named id
is still there, thus left_join()
will not throw an error. If you check your spdf_Brasil_MVI
, you may notice that all VID
values are NA
s. And instead of State codes, id
column includes numeric values stored as characters:
glimpse(spdf_Brasil_MVI)
#> Rows: 581,207
#> Columns: 8
#> $ long <dbl> -73.33251, -73.27482, -72.87016, -72.69151, -72.65935, -72.18967, -72.17318, -…
#> $ lat <dbl> -7.324879, -7.350334, -7.528882, -7.610621, -7.624979, -7.720489, -7.723841, -…
#> ...
#> $ id <chr> "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1"…
#> $ VID <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
Using sf
package for handling spatial data and ggplot's geom_sf()
layer for plotting sf objects simplifies this quite a bit:
library(sf)
library(ggplot2)
library(dplyr)
VIDBrazil <- data.frame(
State = c("SP", "SC", "DF", "MG", "RS", "MS", "PR", "AC", "PI","TO", "MT", "RO", "GO", "RJ", "ES", "MA", "PB", "AL", "RN", "PA", "SE", "PE", "RR", "CE", "AM", "BA", "AP"),
VID = c(7.9, 10.1, 11.2, 11.4, 15.9, 20.7, 20.8, 21.2, 23.8, 24.3, 24.9, 25.0, 26.1, 27.2, 28.2, 28.3, 28.6, 31.8, 32.4, 32.8, 33.9, 34.8, 35.5, 37.0, 39.1, 44.9, 53.8))
sf_Brasil <- st_read("brazil_geo.json", as_tibble = TRUE, quiet = TRUE)
# sf object, a data.frame / tibble with all features (id, name) + geometry column
# and projection details:
sf_Brasil
#> Simple feature collection with 27 features and 2 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: -73.98971 ymin: -33.74708 xmax: -28.84694 ymax: 5.264878
#> Geodetic CRS: WGS 84
#> # A tibble: 27 × 3
#> id name geometry
#> <chr> <chr> <MULTIPOLYGON [°]>
#> 1 AC Acre (((-73.33251 -7.324879, -73.27482 -7.350334, -72.8701…
#> 2 AL Alagoas (((-35.90153 -9.861805, -35.90153 -9.862083, -35.9012…
#> 3 AP Amapá (((-50.02403 0.859862, -50.02403 0.859306, -50.02347 …
# ...
p1 <- ggplot(sf_Brasil) +
geom_sf(fill="#69b3a2", color="white") +
theme_void()
sf_Brasil_MVI <- left_join(sf_Brasil, VIDBrazil, by = join_by(id == State))
# join result, includes VID values:
sf_Brasil_MVI
#> Simple feature collection with 27 features and 3 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: -73.98971 ymin: -33.74708 xmax: -28.84694 ymax: 5.264878
#> Geodetic CRS: WGS 84
#> # A tibble: 27 × 4
#> id name geometry VID
#> <chr> <chr> <MULTIPOLYGON [°]> <dbl>
#> 1 AC Acre (((-73.33251 -7.324879, -73.27482 -7.350334, -7… 21.2
#> 2 AL Alagoas (((-35.90153 -9.861805, -35.90153 -9.862083, -3… 31.8
#> 3 AP Amapá (((-50.02403 0.859862, -50.02403 0.859306, -50.… 53.8
# ...
p2 <- ggplot(sf_Brasil_MVI) +
geom_sf(aes(fill = VID), color="green") +
scale_fill_gradient(name = "Mortes por 100 mil habitantes", low = "#fdc0b2", high = "#9a1f03", limits = c(0, 60.0)) +
theme_void() +
theme(legend.position = "bottom")
p1;p2
For labelling sf objects we could use geom_sf_label()
or geom_sf_text()
, but when things can get bit crowded and it's not so easy to fit all labels, geom_label_repel()
& geom_text_repel()
form ggrepel
package are worth considering.
library(ggrepel)
# add layer to existing p2 plot
# get coordinates for geom_label_repel from sf_coordinates stats
# name : column in p2 data (sf_Brasil_MVI)
# geometry : geometry column of p2 data (sf_Brasil_MVI)
p2 + geom_label_repel(aes(label = name, geometry = geometry),
stat = "sf_coordinates")
Created on 2023-03-12 with reprex v2.0.2