Search code examples
rggplot2geospatialr-sf

Draw a line from polygon centroid to maximum distance border in R


I have polygon shape data for German postcodes. For each postcode I like to calculate the maximum distance from centroid to its border and illustrate this on a map for some of them. I found a post which calculates this maximum via sf package and st_cast() and st_distance(). My data comes as sf dataframe.

How to compute the greatest distance between a centroid and the edge of the polygon using the SF package?

library(sf)
library(tidyverse)

# Get German postcode shape polygons
URL <- "https://downloads.suche-postleitzahl.org/v2/public/plz-5stellig.shp.zip"

# use GDAL virtual file systems to load zipped shapefile from remote url
GER_postcode <- paste0("/vsizip//vsicurl/", URL) %>%  read_sf()

# convert numeric
GER_postcode$plz <- as.numeric(GER_postcode$plz)

# filter a specific postcode
test <- GER_postcode %>% filter(plz == 15232)

# compute distances 
distances <- test %>% 
  st_cast("POINT") %>% 
  st_distance(st_centroid(test))

# maximum dist:
max_dist <- max(distances)
max_dist

ggplot() +
  geom_sf(data = test, size = 1, shape = 16, alpha = 0.1) + # shape
  geom_sf(data = st_centroid(test)) + # centroid 
  theme_bw()

Where exactly is the found maximum (1297.496 [m]) and how can I show the connection on the map?

enter image description here


Solution

  • Your code calculates the maximum distance by casting the border MULTIPOLYGON to the set of points of which the that polygon is comprised, then calculating the distance to each of those points.

    So what we can do is find which of those points is the maximum distance away, create an sf data frame which contains that point and the centroid and summarise() them into a LINESTRING using st_cast().

    # create sf object of border points
    border_points <- test %>%
        st_cast("POINT")
    
    # compute distances
    distances <- border_points |>
        st_distance(st_centroid(test))
    
    max_dist_linestring <- border_points |>
        filter(as.logical(distances == max(distances))) |>
        bind_rows(st_centroid(test)) |>
        summarise() |>
        st_cast("LINESTRING")
    
    ggplot() +
        geom_sf(data = test, size = 1, shape = 16, alpha = 0.1) + # shape
        geom_sf(data = st_centroid(test)) + # centroid
        geom_sf(data = max_dist_linestring) +
        theme_bw()
    

    enter image description here

    A note on calculating centroids and projections

    Your data in lon/lat format. st_crs(GER_postcode) returns 4326, i.e. WGS84, a lat/lon system. However, st_centroid() does not give accurate results for lat/lon data.

    You should transform the data to a projected coordinate system, i.e. a plane. As your data is Germany, you might want to use DE_ETRS89. You can do this with:

    GER_postcode <- st_transform(GER_postcode, crs = 25831)
    

    If you choose a different CRS, just make sure that st_is_longlat(GER_postcode) is FALSE. This will get you a more accurate maximum distance. It makes a difference of about 10 metres in the example you posted. However, depending on the location, you could get a completely incorrect result (i.e. a line which is not actually the furthest distance). See the London projected vs geographic buffer plot here for more.