Search code examples
rmappingshapefiler-sftmap

Get more aggregate shapes from shapefile


As described in this previous question, I'm plotting German zip codes. The most granular level is 5-digit, e.g. 10117. I would like to draw boundaries defined on the two digit level around, while also keeping the granularity of the zip codes for coloring.

Here again the first part of my code to color in zip codes:

# for loading our data
library(raster)
library(readr)
library(readxl)
library(sf)
library(dplyr)

# for datasets
library(maps)
library(spData)

# for plotting
library(grid)
library(tmap)
library(viridis)

Get shape files for Germany (link). In Germany zip codes are called Postleitzahlen (PLZ).

germany <- read_sf("data/OSM_PLZ.shp")

Classify PLZs into arbitrary groups for plotting.

germany <- germany %>% 
  mutate(plz_groups = case_when(
    substr(plz, 1, 1) == "1" ~ "Group A",
    substr(plz, 2, 2) == "2" ~ "Group B",    
    TRUE ~ "Group X" # rest
  ))

Make plot filling by PLZ:

tm_shape(germany) +
  tm_fill(col = "plz_groups") 

enter image description here

The boundaries I want to draw are the following:

enter image description here

I managed to draw German state boundaries, building on the answer in the other post. But for this I used a different shape file. I haven't found existing shapefiles on the level I'm looking for.

Can I use the granular shapefile I already have and somehow it aggregate up to the 2-digit zip code level?


Solution

  • You can create a new variable on your object containing the first two characters of the zip code and use dplyr::group_by() to combine each zip code into a zone.

    See a reprex with just a subset of the zip codes, since this operation takes a long time on your dataset:

    url <- "https://opendata.arcgis.com/api/v3/datasets/5b203df4357844c8a6715d7d411a8341_0/downloads/data?format=geojson&spatialRefId=4326&where=1%3D1"
    
    tmp <- "DEU.geojson"
    download.file(url, tmp, mode = "wb")
    
    library(sf)
    library(dplyr)
    library(tmap)
    
    germany <- st_read(tmp) |> st_make_valid()
    
    # Create an index
    germany$plz_groups <- substr(germany$plz, 1, 2)
    
    # order levels
    lvs <- unique(sort(germany$plz_groups))
    
    germany$plz_groups <- factor(germany$plz_groups, levels = lvs)
    
    
    # Group with dplyr
    
    germany_group <- germany %>%
      # For the example, just from 01 to 8
      # You should remove this filter
      filter(plz_groups  %in% lvs[1:8]) %>%
      select(plz_groups) %>%
      group_by(plz_groups) %>%
      summarise(n = n()) %>%
      # Additionally remove holes
      nngeo::st_remove_holes()
    
    tm_shape(germany_group) +
      tm_polygons() +
      tm_shape(germany_group) +
      tm_text(text = "plz_groups")
    
    
    

    enter image description here