Search code examples
rrasterdata-manipulation

Any efficient way to obtain area estimation with pixel counting approach for land cover raster in R?


I want to get area estimation for each polygon by using pixel counting method. The original landcover map was raster data that come with a distinctive class assigned to each pixel (landcover map legend). However, to get area estimation by using pixel counting is not intuitive to me. Perhaps, the first things I can do is extract all the pixels for each polygon which present information about the distribution of land cover classes within each polygon. After that, I need to use pixel counting method to get area estimation and aggregate all land/soil coverage for the city, agriculture area for each polygon. For me, using pixel counting to obtain area estimation is not straightforward and don't have a solid idea how to get this done in R.

Note that original landcover map is rather big (about 92mb) and it is difficult to produce reproducible example for landcover raster, so forgive me for such inconveniece. Here is the R script to acquire landcover raster:

library(raster)
library(R.utils)

url = "https://cidportal.jrc.ec.europa.eu/ftp/jrc-opendata/LUISA/PrimaryOutput/Europe/REF-2014/JRC_LUISA_Input_Corine_land_cover_2006_r_ref_2014.zip"
download.file(url, basename(url))
gunzip(basename(url))
rname <- list.files(getwd(), "tif$")
land_cover <- raster::raster("~/LUISA_CLC_land_coverage/clc06_r.tif")

I want to aggregate all land/soil coverage information for each polygon (total 403 polygon that land cover information to be aggregated); Here are polygons I am gonna use for area estimation: shapefile with polygons are available on the fly:

enter image description here

I cropped original landcover raster as follow:

deu_shp = maptools::readShapePoly("~/adm2.shp",
                                  proj4string=crswgs84,verbose=TRUE)
deu_proj <- spTransform(deu_shp, CRSobj = land_cover@crs)
land_cover_deu <- crop(land_cover, deu_proj )

To understand area estimation with pixel counting, I googled related article and found this : Using remote sensing for crop area estimation and idea that presented in this article is closely match with my interest, but to implement presented method is purely theoretical and hard for me to do it in R. I am quite okay if there is any quick and dirty way to get area estimation with pixel counting where land/soil coverage information (such as city, forest, agricultural land) must be aggregated to each polygon (shapefile with polygons are available on the fly).

for pixel extraction, I could simply use raster::extract as follow (the code down below is trial):

lapply(deu_proj, function(x) {
    pixel_extract = raster::extract(land_cover_deu, deu_proj[x,])
    pixel_extract= as.data.frame(pixel_extract)
})

and above simple pixel extraction is not very efficient for original land cover raster. I don't know how to make pixel counting for the extracted pixel in each polygon and get corresponding area estimation (such as land coverage of the city, forest, agriculture area and so on).

Any idea to make this happen in R? How can I get area estimation with pixel counting approach for given land cover raster? Is that possible to aggregate land/soil coverage information to each polygon? Thanks in advance


Solution

  • Here is a workflow to get the number of pixels of each land cover class by polygons (an operation called tabulate area). The idea is to rasterize the shapefile using the resolution and the extent of the land cover raster. Then, compute the tabulate area using a very efficient function based on data.table that counts the number of pixels of each class, in each polygon.

    # add ID field to the shapefile
    deu_proj@data$ID <- 1:nrow(deu_proj@data)
    
    # extract extent and resolution of land cover raster
    ext <- extent(land_cover_deu)
    ext <- paste(ext[1], ext[3], ext[2], ext[4])
    res <- paste(res(land_cover_deu)[1], res(land_cover_deu)[2])
    
    # rasterize shapefile using gdal (more efficent than rasterize from raster package)
    # you can change GDAL_CACHEMAX according to your RAM
    system(paste0("gdal_rasterize --config GDAL_CACHEMAX 8000 -a ID -te ",
                   ext," -tr ", res,
                   " -ot Int16 /home/deu_proj.shp /home/zones.tif"))
    
    # load raster
    zones <- raster("/home/zones.tif")
    
    # zonal stats using myZonal function
    Zstat <- myZonal(land_cover_deu, zones)
    
    # reshape output    
    results <- data.table::dcast(Zstat, z ~ vals)
    colnames(results)[1] <- "ID"
    
    # merge data
    deu_proj@data <- plyr::join(deu_proj@data, results, by="ID")
    
    # show results
    head(deu_proj@data[, c(8, 17:ncol(deu_proj@data))])
    
               NAME_2  0   1    2    3    4  5   6   7  8   9 10  11    12  15   16    18    20   21 22    23    24    25  26
    1 Alb-Donau-Kreis NA 241 4504  781 1317 NA  NA 357 NA  NA NA 133 57460  NA   NA  7212 19899 1627 NA 16403  6628 15248 135
    2       Böblingen NA 369 4947 1599 1140 NA  NA 149 87 116 98 438 17845  NA 1712  1717  4742 3079 NA  3988  2286 14557  NA
    3     Baden-Baden NA  45  791  150  306 NA  83  33 NA  NA 38  41   695 338  602   721   468   92 NA  1090  2272  4401  NA
    4        Biberach NA 201 4681  462 1264 NA 152 312 NA  48 NA 131 46163  NA   29 23780 19126  920 NA  2291 28679  5140  NA
    5   Bodenseekreis NA 268 3219  561  814  7 169  81 NA  NA NA  76  9482 418 5403  4750 19105  804 NA    96  9151  8797  NA
    6        Bodensee NA  NA   13    1   NA  9  NA  NA NA  NA NA   3    NA  NA   NA     1     2   NA NA    NA    NA     4  NA
      27   29 30 31 32 34   35  36 37 39 40  41 42 43   45 46   128
    1 NA  581 NA NA NA NA  104  NA NA NA NA 397 NA NA 2643 40    NA
    2 NA  801 NA NA NA NA   NA  NA NA NA NA  NA NA NA 2001 58    NA
    3 NA 1117 NA NA NA NA  157  NA NA NA NA  79 NA NA  486  1    NA
    4 NA 2063 NA NA NA NA 1105 273 NA NA NA 384 NA NA 3760 25    NA
    5 NA  240 NA NA NA NA  299  NA NA NA NA 193 NA NA 2476 66    45
    6 NA   NA NA NA NA NA    4  NA NA NA NA 415 NA NA   20  1 27563
    

    Zonal Stats function (adapted from here)

    myZonal <- function (x, z, digits = 0, na.rm = TRUE) { 
      vals <- raster::getValues(x) 
      zones <- round(raster::getValues(z), digits = digits) 
      rDT <- data.table::data.table(vals, z = zones) 
      plyr::count(rDT) }
    

    Sample data

    # I have loaded the shapefile using getData but you can use your own shp.
    # The only difference will be in the column numbers of "show results" step.
    deu_shp <- getData("GADM", country="DEU", level=2)