
Using shapefile to fill geom_polygon


I am trying to plot a choropleth map of the population of Nigeria. For this, I downloaded the population data with the wopr package:

library(remotes)
remotes::install_github('wpgp/wopr')
library(wopr)

catalogue <- getCatalogue()
selection <- subset(catalogue,
                    country == 'NGA' &
                      category == 'Population' & 
                      version == 'v1.2')
downloadData(selection)

Then I unzipped the downloaded data and read in the .shp file, which also contains the mean population estimates for the administrative level 3 divisions of Nigeria:

library(rgdal)

shape <- readOGR(here::here("wopr/NGA/population/v1.2/NGA_population_v1_2_admin/NGA_population_v1_2_admin_level3_boundaries.shp"))

shape

class       : SpatialPolygonsDataFrame 
features    : 774 
extent      : 2.6925, 14.67797, 4.271484, 13.88571  (xmin, xmax, ymin, ymax)
crs         : +proj=longlat +datum=WGS84 +no_defs 
variables   : 18
names       : lgacode,   lganame, statecode,          source,               timestamp,                             globalid,    amapcode,   id, statename, region,         mean,       q025,       q05,       q25,     q50, ... 
min values  :   10001, Aba North,        AB, abraham.oluseye, 2015/08/08 11:30:41.000, 000fac93-f92b-4f2b-a003-03318fe407c1, NIE ABS ABA, 2845,      Abia,      1,     252.3734,        115,       136,       195,     237, ... 
max values  :    9018,      Zuru,        ZA,             WHO, 2018/08/07 08:35:09.000, ff682d23-27fa-4395-8b26-d2bc5803e7c2, NIE ZAS ZRM, 3664,   Zamfara,     11, 2181858.3981, 1622184.45, 1662922.4, 1834489.5, 2096976, ... 

head(shape@data)

  lgacode          lganame statecode      source               timestamp
0   27011        Kontagora        NI EHA_ABRAHAM 2017/01/22 15:55:38.000
1   27015           Mariga        NI EHA_ABRAHAM 2017/01/25 18:08:13.000
2   25004     Amuwo Odofin        LA NGA_TEAMGIS 2018/08/07 08:35:09.000
3   25002 Ajeromi Ifelodun        LA NGA_TEAMGIS 2018/08/07 08:35:09.000
4   25018         Surulere        LA EHA-OLUSEYE 2016/05/10 14:03:50.000
5   31014              Ido        OY EHA-ABRAHAM 2016/11/02 10:39:49.000
                              globalid    amapcode   id statename region     mean
0 74b7e7e5-66fb-4a11-961f-f66b657df869 NIE NIS KNT 3618     Niger      2 235330.7
1 344d9dce-9643-4c16-a1f0-595d97dea13c NIE NIS BMG 3619     Niger      2 373467.9
2 2a5b0ca2-4065-45f2-9a2f-2545dc1fe9c3 NIE LAS FST 3635     Lagos     11 377541.3
3 b44f187e-1ebd-4ef3-9bc3-3e2be785e640 NIE LAS AGL 3636     Lagos     11 314222.6
4 95c41101-4143-4abf-8adb-f63b79b09555 NIE LAS LSR 3637     Lagos     11 118282.3
5 6777fa86-afd3-4532-bf38-d897dce835d3 NIE OYS DDA 3620       Oyo      7 531508.6
      q025       q05      q25      q50      q75      q95     q975
0 153685.4 163240.50 196365.8 225030.0 260180.2 341048.0 378108.2
1 159215.6 182313.50 270608.8 338850.0 435652.8 668303.5 775215.3
2 284024.3 296227.30 333860.8 367006.5 408052.5 493519.2 529741.5
3 183500.2 198570.15 252164.5 300170.5 358764.5 475849.5 527268.7
4  69839.8  75569.75  95543.5 112952.5 134939.2 178012.4 196889.1
5 435055.2 448660.55 492885.5 525545.5 563986.0 631545.6 659961.0

I want to plot the boundary of Nigeria and fill the administrative level 3 divisions by population (shape@data$mean):

library(ggplot2)
ggplot(data = shape) +
  geom_polygon(aes(x = long, y = lat, group = group, fill = mean))

But I get this error: "Error: Aesthetics must be valid data columns. Problematic aesthetic(s): fill = mean. Did you mistype the name of a data column or forget to add after_stat()?"


Solution

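  • The error occurs because ggplot2 converts a SpatialPolygonsDataFrame into a plain data frame of vertices (long, lat, group, and so on) and drops the attribute table, so there is no mean column to map to fill. One option is to use the sf package, which keeps geometry and attributes together in a single data frame.

    Convert the object you read with rgdal to sf, then use geom_sf to draw the boundaries filled by mean: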

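    # install.packages("sf")
    library(tidyverse)  # provides ggplot2 and the %>% pipe

    shape %>% 
      sf::st_as_sf() %>%          # SpatialPolygonsDataFrame -> sf, attributes kept
      ggplot(aes(fill = mean)) +  # mean is now an ordinary column
      geom_sf()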
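
As a variation on the conversion above, the shapefile can also be read straight into an sf object with sf::st_read, which skips rgdal altogether (rgdal has since been retired from CRAN). The sketch below is only an illustration: so that it runs without the WOPR download, it uses the nc.shp file that ships with sf as a stand-in, and BIR74 plays the role of mean. The names shp_path, shape_sf and p are made up for this example.

```r
library(sf)
library(ggplot2)

# Stand-in data (assumption): the North Carolina shapefile bundled with sf.
# For the Nigeria data, pass the .shp path from the question to st_read().
shp_path <- system.file("shape/nc.shp", package = "sf")
shape_sf <- st_read(shp_path, quiet = TRUE)

# BIR74 is a numeric attribute of the stand-in file;
# with the WOPR shapefile you would map fill = mean instead.
p <- ggplot(shape_sf) +
  geom_sf(aes(fill = BIR74), colour = NA) +  # colour = NA hides polygon borders
  scale_fill_viridis_c(name = "BIR74") +
  theme_void()

p
```

With 774 small polygons, hiding the borders as above tends to make the fill easier to read, but that is a matter of taste.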
    

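
If you want to stay with geom_polygon as in the question, the attribute has to be attached to the vertex table yourself. The following is a rough sketch of one way to do that, not the answer's method: it extracts the vertices with sf::st_coordinates and looks up the fill value by feature index. It again uses the bundled nc.shp with BIR74 as a stand-in for the Nigeria file and its mean column, and the names coords, value and part are made up for this example.

```r
library(sf)
library(ggplot2)

# Stand-in data (assumption): nc.shp from sf; BIR74 stands in for `mean`.
shape_sf <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
shape_sf <- st_cast(shape_sf, "MULTIPOLYGON")  # ensure one geometry type

# One row per vertex. For MULTIPOLYGON input the columns are
# X, Y, L1 (ring), L2 (polygon part) and L3 (feature index).
coords <- as.data.frame(st_coordinates(shape_sf))

# Attach the attribute by feature index; this is the step the
# original geom_polygon() call was missing.
coords$value <- shape_sf$BIR74[coords$L3]

# One group per polygon part; rings within a part become subgroups,
# so interior rings are drawn as holes.
coords$part <- interaction(coords$L2, coords$L3, drop = TRUE)

p <- ggplot(coords, aes(x = X, y = Y, group = part, subgroup = L1, fill = value)) +
  geom_polygon() +
  coord_quickmap()

p
```

geom_sf does all of this bookkeeping for you, so this route is mainly useful when geom_polygon is a hard requirement.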