Search code examples
rrasterr-sfr-rasterrasterizing

Rasterizing polygons with complicated weighting


Imagine a regular 0.5° grid across the Earth's surface. A 3x3 subset of this grid is shown below. As a stylized example of what I'm working with, let's say I have three polygons—yellow, orange, and blue—that for the sake of simplicity all are 1 unit in area. These polygons have attributes Population and Value, which you can see in the legend:

Example grid with polygons and data

I want to turn these polygons into a 0.5° raster (with global extent) whose values are based on the weighted-mean Value of the polygons. The tricky part is that I want to weight the polygons' values based on not their Population, but rather on their included population.

I know—theoretically—what I want to do, and below have done it for the center gridcell.

  1. Multiply Population by Included (the area of the polygon that is included in the gridcell) to get Pop. included. (Assumes population is distributed evenly throughout polygon, which is acceptable.)
  2. Divide each polygon's Included_pop by the sum of all polygons' Included_pop (32) to get Weight.
  3. Multiply each polygon's Value by Weight to get Result.
  4. Sum all polygons' Result to get the value for the center gridcell (0.31).
Population Value Frac. included Pop. included Weight Result
Yellow 24 0.8 0.25 6 0.1875 0.15
Orange 16 0.4 0.5 8 0.25 0.10
Blue 18 0.1 1 18 0.5625 0.06
32 0.31

I have an idea of how to accomplish this in R, as described below. Where possible, I've filled in code that I think will do what I want. My questions: How do I do steps 2 and 3? Or is there a simpler way to do this? If you want to play around with this, I have uploaded old_polygons as a .rds file here.

library("sf")
library("raster")
  1. Calculate the area of each polygon: old_polygons$area <- as.numeric(st_area(old_polygons))
  2. Generate the global 0.5° grid as some kind of Spatial object.
  3. Split the polygons by the grid, generating new_polygons.
  4. Calculate area of the new polygons: new_polygons$new_area <- as.numeric(st_area(new_polygons))
  5. Calculate fraction included for each new polygon: new_polygons$frac_included <- new_polygons$new_area / new_polygons$old_area
  6. Calculate "included population" in the new polygons: new_polygons$pop_included <- new_polygons$pop * new_polygons$frac_included
  7. Calculate a new attribute for each polygon that is just their Value times their included population. new_polygons$tmp <- new_polygons$Value * new_polygons$frac_included
  8. Set up an empty raster for the next steps: empty_raster <- raster(nrows=360, ncols=720, xmn=-180, xmx=180, ymn=-90, ymx=90)
  9. Rasterize the polygons by summing this new attribute together within each gridcell. tmp_raster <- rasterize(new_polygons, empty_raster, "tmp", fun = "sum")
  10. Create another raster that is just the total population in each gridcell: pop_raster <- rasterize(new_polygons, empty_raster, "pop_included", fun = "sum")
  11. Divide the first raster by the second to get what I want:
output_raster <- empty_raster
values(output_raster) <- getValues(tmp_raster) / getValues(pop_raster)

Any help would be much appreciated!


Solution

  • Example data:

    library(terra)
    f <- system.file("ex/lux.shp", package="terra")
    v <- vect(f)
    values(v) <- data.frame(population=1:12, value=round(c(2:13)/14, 2))
    r <- rast(ext(v)+.05, ncols=4, nrows=6, names="cell")
    

    Illustrate the data

    p <- as.polygons(r)
    plot(p, lwd=2, col="gray", border="light gray")
    lines(v, col=rainbow(12), lwd=2)
    txt <- paste0(v$value, " (", v$population, ")") 
    text(v, txt, cex=.8, halo=TRUE)
    

    enter image description here

    Solution:

    # area of the polygons
    v$area1 <- expanse(v)
    # intersect with raster cell boundaries
    values(r) <- 1:ncell(r) 
    p <- as.polygons(r)
    pv <- intersect(p, v)
    
    # area of the polygon parts
    pv$area2 <- expanse(pv)
    pv$frac <- pv$area2 / pv$area1
    

    Now we just use the data.frame with the attributes of the polygons to compute the polygon-cover-weighted-population-weighted values.

    z <- values(pv)
    a <- aggregate(z[, "frac", drop=FALSE], z[,"cell",drop=FALSE], sum)
    names(a)[2] <- 'fsum'
    z <- merge(z, a)
    z$weight <- z$population * z$frac / z$fsum
    z$wvalue <- z$value * z$weight
    b <- aggregate(z[, c("wvalue", "weight")], z[, "cell", drop=FALSE], sum)
    b$bingo <- b$wvalue / b$weight
    

    Assign values back to raster cells

    x <- rast(r)
    x[b$cell] <- b$bingo
    

    Inspect results

    plot(x)
    lines(v)
    text(x, digits=2, halo=TRUE, cex=.9)
    text(v, "value", cex=.8, col="red", halo=TRUE)
    

    enter image description here

    This may not scale very well to large data sets, but you could perhaps do it in chunks.