Finding ratio of perimeter with surrounding neighbors in R

I have a shape file at Sample Data

I read the data in R as sf object named "SD" and have columns "GEOID" and "geometry" along with other columns. I created a new column in the data named "POP"

SD$POP <- sample(2000:8000, 35)

First, I want to measure the area and perimeter for each "GEOID".

Second, I need to find how ratio of each GEOID's perimeter shared with it's surrounding GEOIDs. For example: say GEOID 09 is surrounded by GEOID 06, 08, 10, 13. I want to find what percent of GEOID 09's perimeter is shared with 06, 08, 10, and 13 individually. Then Create new columns "POP_PERCENT" and "ADJ_POP" which will be

ADJ-POP = SUM OF POP for GEOID 06, 08, 10, and 13' .

POP_PERCENT =  Percent share with 06 + Percent share with 08 + Percent share with 10 +
Percent share with 13

My apologies for verbose question. I would really appreciate any help.

Solution

I think generally what you need is st_intersection plus some data manipulation. The dataset does not have a GEOID column, and I'm not completely clear what the expected output for POP_PERCENT is (adjacent population weighted by percent perimeter overlap?). So this isn't a totally complete answer.

Here is a start using sf and some other tidyverse functions. This calculates the area, perimeter, and intersections of each polygon. I created a GEOID column from the rownames. Hopefully that is enough to get you going in the right direction.

First calculate area and perimeter of each

library(tidyverse)
library(sf)

SD_ap <- SD %>%
  st_transform(5070) %>% #convert from lat/long to projected coord system
  mutate(area = st_area(.),
         length_perim = lwgeom::st_perimeter(.))

#-----
  GEOID  POP                       geometry              area         perim
1     1 6582 MULTIPOLYGON (((860088.4 12...   357945281 [m^2]  139932.2 [m]
2     2 3199 MULTIPOLYGON (((843395.6 11...   137754181 [m^2]  100879.4 [m]

Then calculate the overlapping perimeters. Convert the polygons to lines, then run st_intersection. Instead of the 35 polygons, this returns 141 line segments, distinguished by original overlapping polygons (origins)

SD_overlap <- SD_ap %>%
  st_cast('MULTILINESTRING') %>%
  st_intersection() %>%
  mutate(length_intersection = st_length(.)) %>%
  filter(length_intersection > units::as_units(0,'m'))

#----
     GEOID  POP              area  length_perim n.overlaps origins length_intersection
1        1 6559   357945281 [m^2]  139932.2 [m]          2    1, 2        6761.611 [m]
2        2 4781   137754181 [m^2]  100879.4 [m]          2    2, 3       24201.037 [m]
2.2      2 4781   137754181 [m^2]  100879.4 [m]          2    2, 4       31628.198 [m]
3        3 3424   295676500 [m^2]  212930.4 [m]          2    3, 4       22967.725 [m]

Plot the first 4 GEOIDs to check this is working out as expected

ggplot(data = filter(SD, GEOID %in% 1:4),
       aes(fill = as.factor(GEOID),
           label = GEOID)) +
  geom_sf(alpha = .2) +
  geom_sf(data = filter(SD_overlap, map_lgl(origins, ~all(.x %in% 1:4))), 
          aes(color = map_chr(origins, ~paste(.x, collapse = ','))),
          fill = NA,
          lwd = 2,
          show.legend = 'line') +
  geom_sf_label(fill = "white",  # override the fill from aes()
                fun.geometry = sf::st_centroid) +
  guides(fill = 'none')+
  labs(color='Overlap')

To answer the question of what populations are adjacent, you can try something like this.

sum_adj_pops <- function(GEOID_i){
  
  ID_adj <- SD_overlap %>%
    st_drop_geometry() %>%
    filter(map_lgl(origins, ~GEOID_i %in% .x)) %>%
    unnest_longer(origins) %>%
    filter(origins != GEOID_i) %>%
    pull(origins)
  
  SD %>%
    st_drop_geometry() %>%
    filter(GEOID %in% ID_adj) %>%
    summarize(POP_adj = sum(POP)) %>%
    pull(POP_adj)
}

SD %>%
  mutate(POP_adj = map_dbl(GEOID, sum_adj_pops))

#----
   GEOID  POP                       geometry POP_adj
1      1 6559 MULTIPOLYGON (((-86.64623 3...   24024
2      2 4781 MULTIPOLYGON (((-86.861 33....   27132
3      3 3424 MULTIPOLYGON (((-86.97402 3...   21907
4      4 4230 MULTIPOLYGON (((-86.74407 3...   22259
5      5 7054 MULTIPOLYGON (((-87.78275 3...   22021

Data

url_shp <- 'https://github.com/ahadzaman1002/example_data/raw/main/fe_2007_01_sldu00.zip'

tmp_zip <- tempfile(fileext = '.zip')
tmp_dir <- tempdir()
download.file(url = url_shp, tmp_zip)
unzip(tmp_zip, exdir = tmp_dir)

path_shp <- list.files(tmp_dir, '.shp$', full.names = TRUE)

set.seed(415) #for reproducibility with sample()

SD <- st_read(path_shp) %>%
  mutate(POP = sample(2000:8000, 35)) %>%
  rowid_to_column('GEOID')