Search code examples
rgeometrygisrasterr-sf

Get percentage of overlap between two multipolygons in R


I have two MULTYPOLYGONS, one where a row is a election district*year pair, and the other is a collection of flood events.

 district.df
      year ward_ons     cycle                       geometry
    1 2007       E1   NA-2007 POLYGON ((527370.8 183470.7...
    2 2008       E1 2007-2008 POLYGON ((528891.1 182192.6...
    3 2009       E2   NA-2009 POLYGON ((370294.2 414678.7...
    4 2010       E3   NA-2010 POLYGON ((375025.4 414992.1...
    5 2011       E3 2010-2011 POLYGON ((375150.8 410809.8...
    6 2018       E3 2011-2018 POLYGON ((373286.3 414364.5...
    7 2007       E4   NA-2007 POLYGON ((373168.6 411597.8...
    8 2010       E4 2007-2010 POLYGON ((374783.2 406209.4...

Flood data:

    flood.df
    Simple feature collection with 8 features and 2 fields
geometry type:  GEOMETRY
dimension:      XY
bbox:           xmin: 317656.2 ymin: 90783.2 xmax: 546460.6 ymax: 631125.7
projected CRS:  OSGB 1936 / British National Grid
  year            name                       geometry
1 2007      River 2007 POLYGON ((359637.7 268239.7...
2 2007       Tank 2007 POLYGON ((325444.1 92717.57...
3 2008  Yorkshire 2008 POLYGON ((318550.7 103058.8...
4 2009 Flood East 2009 POLYGON ((541472.6 112593, ...
5 2010  Occurence 2010 MULTIPOLYGON (((545863.4 11...
6 2012      Storm 2012 POLYGON ((473637.4 103927, ...
7 2011      Flood 2011 MULTIPOLYGON (((524617.6 42...
8 2017      River 2017 POLYGON ((393387.6 631125.7...

What I am trying to do is to get a column in the election district multypolygon which is the share of its area that overlaps with any of the polygons in the flood multipolygon with the same value for year.

This is what I tried (influenced by https://gis.stackexchange.com/questions/140504/extracting-intersection-areas-in-r)

# Create function
overlap.fraction <- function(election, flood, yeari){
  flood.year <- flood[flood$year == yeari,]
  election.year <- election[election$year == yeari, ]
for (i in 1:nrow(election.year)){
  int.i <- as_tibble(st_intersection(i, flood.year))
  int.i$affected <- st_area(int.i$geoms)
  affected.by.county(paste0(yeari)) <- int.i%>%
  dplyr::group_by(code) %>%
  dplyr::summarise(affected.area = sum(affected))
}
}

And then tried it

i.1990 <- overlap.fraction(district.df, flood.df, yeari = 1990)

But keep getting:

Error in UseMethod("st_intersection") : no applicable method for 'st_intersection' applied to an object of class "c('integer', 'numeric')"

I tried different alterations to the function and the loop inside it. Ideally, I would even get a function or loop that would do it at once instead of having to do it for every year, as is the case for what I am trying here. I am doing it this way because I am getting closer to getting it done this way, but I understand this is not the most effective way.

Any help would be great. Thanks!


Solution

  • Let's create a function to do it by year.

    library(sf)
    
    ## storing years in a variable
    years <- unique(district.df$year)
    
    ## creating an "auxiliary" geometry
    flood.df$geom_2 <- st_geometry(flood.df)
    
    ## function to calculate this "proportion of intersection"
    ## inputs: y = year; dt1 = dataset1 (district); dt2 = dataset2(flood)
    prop_intersect <- function(y, dt1, dt2) {
      ## filtering year
      out1 <- dt1[dt1$year == y,]
      out2 <- dt2[dt2$year == y,]
    
      ## calculating the areas for the first dataset
      out1 <- transform(out1, dist_area = as.numeric(st_area(geometry)))
    
      ## joining the datasets (polygons that intersect will be joined).
      ## It would be nice to have id variables for both datasets
      output <- st_join(
        x    = out1,
        y    = out2,
        join = st_intersects
      )
    
      ## calculating area of intersection
      output <- transform(output, 
                          inter_area = mapply(function(x, y) {
                                          as.numeric(sf::st_area(
                                               sf::st_intersection(x, y)
                                           ))}, x = geometry, y = geom_2))
    
      ## calculating proportion of intersected area
      output <- transform(output, prop_inter = inter_area/dist_area)
    
      return(output)
    }
    

    So, now we have a function to do what you request. It is very likely that there exists a better (more efficient and "clean") way to do it. However, this is what I could come up with. Also, since this is not a reprex, it is hard for me to know whether this code works or not.

    That being said, now we can iterate on "year" as follows

    final_df <- lapply(years, prop_intersect, dt1 = district.df, dt2 = flood.df)
    
    final_df <- do.call("rbind", final_df)