I want to aggregate raster data to each polygon in a custom shapefile.
In this case, I want to obtain the mean degree of urbanization within subnational regions in Subsaharan Africa.
My sf looks like this:
> africa_map
Simple feature collection with 543 features and 4 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -25.36042 ymin: -46.96575 xmax: 63.49391 ymax: 27.66147
Geodetic CRS: WGS 84
First 10 features:
cname ccode regname continent geometry
1 Angola AO Bengo Africa MULTIPOLYGON (((13.371 -8.5...
2 Angola AO Benguela Africa MULTIPOLYGON (((12.53336 -1...
3 Angola AO Bie Africa MULTIPOLYGON (((16.61158 -1...
4 Angola AO Cabinda Africa MULTIPOLYGON (((12.78266 -4...
5 Angola AO Cuando Cubango Africa MULTIPOLYGON (((21.9838 -16...
6 Angola AO Cuanza Norte Africa MULTIPOLYGON (((15.40788 -7...
7 Angola AO Cuanza Sul Africa MULTIPOLYGON (((13.7926 -11...
Or plotted:
The raster data, on the other hand, take this form:
> imported_raster
class : RasterLayer
dimensions : 18000, 36082, 649476000 (nrow, ncol, ncell)
resolution : 1000, 1000 (x, y)
extent : -18041000, 18041000, -9e+06, 9e+06 (xmin, xmax, ymin, ymax)
crs : +proj=moll +lon_0=0 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs
names : GHS_BUILT_LDS1975_GLOBE_R2018A_54009_1K_V2_0
values : 0, 100 (min, max)
These are much more fine-grained than needed and for the whole planet. For the sake of accelerating computation, I first aggregated the raster, to then convert it to a shapefile with each remaining raster pixel converted to a point geometry within a shapefile. This shapefile, then, could be aggregated to my region borders. Admittedly, this is not very elegant (and, eventually, doesn't work).
library(tidyverse)
library(sf)
library(raster)
library(stars)
library(rgdal)
> # aggregate (to 25x25 km)
> imported_raster_ag <- aggregate(imported_raster, fact=25)
>
> # convert to sp
> urbanized = rasterToPolygons(imported_raster_ag)
> # convert to sf
> urbanized_sf <- st_as_sf(urbanized)
# compare projection
st_crs(africa_map)==st_crs(urbanized_sf)
# align projection
urbanized_sf <- st_transform(urbanized_sf, st_crs(africa_map))
urbanized_sf <- urbanized_sf %>% rename(urbanization = GHS_BUILT_LDS1975_GLOBE_R2018A_54009_1K_V2_0)
> urbanized_sf
Simple feature collection with 398872 features and 1 field (with 500 geometries empty)
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: -179.9999 ymin: -57.40086 xmax: 179.9963 ymax: 82.33738
Geodetic CRS: WGS 84
First 10 features:
urbanization geometry
1 0 POLYGON ((-117.1367 82.3373...
2 0 POLYGON ((-116.2261 82.3373...
3 0 POLYGON ((-115.3156 82.3373...
4 0 POLYGON ((-114.405 82.33738...
5 0 POLYGON ((-113.4944 82.3373...
6 0 POLYGON ((-112.5838 82.3373...
7 0 POLYGON ((-111.6732 82.3373...
8 0 POLYGON ((-110.7627 82.3373...
9 0 POLYGON ((-109.8521 82.3373...
10 0 POLYGON ((-108.9415 82.3373...
The idea was that as of now, I could aggregate these points along the regional borders in the other sf. However, I get an error message and the only recommended fix I found only repeats it.
> urbanized_africa <- aggregate(urbanized_sf["urbanization"], by = africa_map$geometry, mean)
although coordinates are longitude/latitude, st_intersects assumes that they are planar
Error in CPL_geos_binop(st_geometry(x), st_geometry(y), op, par, pattern, :
Evaluation error: IllegalArgumentException: Invalid number of points in LinearRing found 2 - must be 0 or >= 4.
> urbanized_sf_fixed <- sf::st_make_valid(urbanized_sf)
Error in CPL_geos_make_valid(x) :
Evaluation error: IllegalArgumentException: Invalid number of points in LinearRing found 2 - must be 0 or >= 4.
I guess somewhere in my conversion process something could corrupted or something else is more fundamentally flawed. What are more elegant and, above all, more functional workflows to aggregate raster data into shapefile polygons?
Have a look to the extract
function
In your case something like
africamap$urbanized <- extract(imported_raster, africamap, fun="mean")
africamap
should still be projected to the raster projection first. If you have nodata
values then it should be wise to add a na.rm=TRUE
parameter to the call.
To improve processing time you could also crop the raster to africa.
There is also a faster version of extract
in the exactextractr
package. You could also take advantage of using terra
instead of raster
for raster processing.