Goal: Produce an updated gridded population raster for 2023 based on 2020 gridded population raster and 2023 polygon population estimates.
To do this, I have taken the approach to try to update all cell values in a raster with new cell values stored in a table by cell number.
I have a large raster with gridded population estimates from 2020, county polygons, and a table of 2023 population by county. I would like to update the cell values in the 2020 raster in the same distribution/density, but reflecting the 2023 population.
I have started with the following:
exactextractr::exact_extract
to sum the 2020 cell values (population) within each county polygon.terra:extract
to extract all cell values in the 2020 rasterI have a table with the county ID, cell number, old cell value, and new cell value, but I'm unsure how to update the 2020 raster with the new values. If this was a dataframe, I would join by cell number or just directly assign the new values, assuming the same row order. I have looked at the following SO articles (and many more), but they seem to address only conditional value replacement.
Below is a somewhat reproducible example (thanks to Zonal Statistics R (raster/polygon) for the generic raster and polygons). The cell numbers are not showing up in the extract
step the way they do with the real data, but otherwise quite similar. Ideally, I would like to join by cell number to have some peace of mind and not be dependent on the row order.
library(terra)
library(exactextractr)
library(tidyverse)
# Create raster
ras <- raster(nrows=100, ncols=80, xmn=0, xmx=1000, ymn=0, ymx=800)
val <- runif(ncell(ras))
values(ras) <- val
# Create polygon within raster extent
xym <- cbind(runif(3,0,1000), runif(3,0,800))
p <- Polygons(list(Polygon(xym)),1)
sp <- SpatialPolygons(list(p))
spdf <- SpatialPolygonsDataFrame(sp, data=data.frame(1))
# Mask raster to within polygon
r2 <- mask(ras, spdf)
plot(r2)
lines(spdf)
#Create table of for new polygon value
new_tbl <-
tibble(X1 = 1,
new_sum = 10000)
#1. Use ```extact_extractr::exact_extract``` to sum the 2020 cell values within each polygon.
poly_pop <- exact_extract(
r2,
spdf,
fun = 'sum',
weights = 'area',
append_cols = T
)
#2. Use ```terra:extract``` to extract all cell values in the old raster
r2_cell_values <-
terra::extract(
r2, # raster file
spdf, # polygons
df = TRUE,
cells = TRUE, # Include cell numbers
ID = TRUE, # Include polygon IDs
exact = TRUE # Include the proportion of the cell covered by the polygon
)
#3. Joined 1 and 2 to calculate the proportion of the value for each cell (i.e. cell value / polygon sum).
cell_props <-
r2_cell_values %>%
filter(!is.na(layer)) %>% # Remove NA values
left_join(poly_pop, by = c("ID" = "X1")) %>%
mutate(
cell_prop = layer / sum
)
#4. Joined thenew values to 3 and calculated new cell values reflecting the mew polygon value (new polygon value * proportion of old polygon value in each cell)
new_cell_tbl <-
cell_props %>%
left_join(new_tbl, by = c("ID" = "X1")) %>%
mutate(new_cell_value = cell_prop * new_sum)
#5. I can't seem to figure out how to get the new cell values into the 2020 raster.
x <- rasterize(spdf, r2, 1)
cellStats(x, 'sum') == nrow(new_cell_tbl) # Check whether number of values in r2 and new_cell_tbl are the same
values(r2) <- new_cell_tbl$new_cell_value
I end up with this error message: Error in setValues(x, value) : length(values) is not equal to ncell(x), or to 1
, even though I try to test whether the number of raster cells and length of the new vector are the same in the step before.
I'm fairly new to this, so any guidance on how I can update the 2020 cells based on the 2023 admin area values would be appreciated. Thanks!
Example data
library(terra)
pop <- rast(system.file("ex/elev.tif", package="terra"))
names(pop) <- "pop2020"
adm <- vect(system.file("ex/lux.shp", package="terra"))
Compute relative population per admin unit for "2020"
adm <- extract(pop, adm, "sum", na.rm=TRUE, bind=TRUE)
pop_agg <- rasterize(adm, pop, "pop2020")
rel_pop <- pop / pop_agg
Use the relative distribution to compute the raster cell counts for "2030"
#example data
adm$pop2030 <- adm$pop2020 * (1:12) / 2
# solution
new_pop <- rasterize(adm, pop, "pop2030") * rel_pop