Search code examples
rgisgeospatialr-sfcentroid

How to calculate the centroid of a polygon shape file in r


I have this dataset: https://public.opendatasoft.com/explore/dataset/georef-netherlands-postcode-pc4/export/?location=6,52.1564,5.29337&basemap=jawg.light

I need to upload it to R and extract the centroid of the polygons in terms of longitude and latitude, so that I can use it to calculate distances between two centroids.

So far I've done this:

PC4shape <- read_sf(dsn = "georef-netherlands-postcode-pc4", layer = "georef-netherlands-postcode-pc4")

PC4shape$centroids <- st_transform(PC4shape, 28992) %>% 
  st_centroid() %>% 
  st_transform(., '+proj=longlat +ellps=GRS80 +no_defs') %>%
  st_geometry()

Following Function to calculate geospatial distance between two points (lat,long) using R

However, I am sure I have the projection and the EPSG wrong. I just don't know how to find the right one. Any help is greatly appreciated.


Solution

  • It isn't clear between which centroids you want to measure, so there's a sample below of couple of different ways to find some or all distances.

    The code below finds the centroid of each row, shows the distance between centroids of the first two rows, and then a small distance matrix of the first 10 rows.

    The crs is the original wgs 84, distances are in meters. I didn't see a need to change the crs.

    library(tidyverse)
    library(sf)
    
    
    x <- read_sf('shapefile_dir/') # Replace with local directory name
    
    x <- x %>% mutate(centroids = st_centroid(st_geometry(.)))
    
    #Distance between Goeree-Overflakkee (row 1) and Dordrecht (row 2):
    st_distance(x$centroids[1], x$centroids[2])
    #> Units: [m]
    #>          [,1]
    #> [1,] 53317.83
    
    #distance matrix of first 10 rows:
    x[1:10,] %>% 
      st_set_geometry('centroids') %>%  # Since there are 2 geometry cols, this sets the active one to 'centroids'
      st_distance()
    #> Units: [m]
    #>           [,1]      [,2]      [,3]      [,4]      [,5]      [,6]      [,7]
    #>  [1,]     0.00 53317.826 49330.003 85844.745 76152.903 69198.288 89437.405
    #>  [2,] 53317.83     0.000  7355.978 38618.416 37934.810 31387.648 45356.081
    #>  [3,] 49330.00  7355.978     0.000 38510.970 34633.274 27580.688 44165.444
    #>  [4,] 85844.75 38618.416 38510.970     0.000 16873.684 19856.326  8741.436
    #>  [5,] 76152.90 37934.810 34633.274 16873.684     0.000  7298.181 14836.265
    #>  [6,] 69198.29 31387.648 27580.688 19856.326  7298.181     0.000 20611.458
    #>  [7,] 89437.40 45356.081 44165.444  8741.436 14836.265 20611.458     0.000
    #>  [8,] 84873.91 42274.715 40485.556  9634.087  9915.349 15823.963  4928.774
    #>  [9,] 88788.51 45795.090 44243.643 10476.055 13443.233 19690.611  2255.828
    #> [10,] 94857.25 50872.407 49779.022 13205.312 19479.782 25802.514  5617.354
    #>            [,8]      [,9]     [,10]
    #>  [1,] 84873.911 88788.511 94857.246
    #>  [2,] 42274.715 45795.090 50872.407
    #>  [3,] 40485.556 44243.643 49779.022
    #>  [4,]  9634.087 10476.055 13205.312
    #>  [5,]  9915.349 13443.233 19479.782
    #>  [6,] 15823.963 19690.611 25802.514
    #>  [7,]  4928.774  2255.828  5617.354
    #>  [8,]     0.000  3922.534  9994.837
    #>  [9,]  3922.534     0.000  6115.383
    #> [10,]  9994.837  6115.383     0.000
    
    # A distance matrix of all pairs of points can be found by not
    #  subsetting the data:
    #  x %>% st_set_geometry('centroids') %>% st_distance()
    

    Created on 2022-03-10 by the reprex package (v0.3.0)