Search code examples
rmergedataframeextractraster

Extracting data of two overlapping rasters


I am trying extract the data of two overlapping set of rasters (one, a stack of 35 rasters, all from the same source and the second an elevation raster) to get a data.frame of the values (mean of the values) of each pixel of all the rasters.

The description of the raster stack is the following:

> stack_pacifico
class       : RasterStack 
dimensions  : 997, 709, 706873, 35  (nrow, ncol, ncell, nlayers)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -81.62083, -75.7125, 0.3458336, 8.654167  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
names       : F101992.v//ts.avg_vis, F101993.v//ts.avg_vis, F101994.v//ts.avg_vis, F121994.v//ts.avg_vis, F121995.v//ts.avg_vis, F121996.v//ts.avg_vis, F121997.v//ts.avg_vis, F121998.v//ts.avg_vis, F121999.v//ts.avg_vis, F141997.v//ts.avg_vis, F141998.v//ts.avg_vis, F141999.v//ts.avg_vis, F142000.v//ts.avg_vis, F142001.v//ts.avg_vis, F142002.v//ts.avg_vis, ... 
min values  :                     0,                     0,                     0,                     0,                     0,                     0,                     0,                     0,                     0,                     0,                     0,                     0,                     0,                     0,                     0, ... 
max values  :                    63,                    63,                    63,                    63,                    63,                    63,                    63,                    63,                    63,                    63,                    63,                    63,                    63,                    63,                    63, ... 

And for the elevation raster:

> elevation_pacifico
class       : RasterLayer 
dimensions  : 997, 709, 706873  (nrow, ncol, ncell)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -81.62083, -75.7125, 0.3458336, 8.654167  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
data source : in memory
names       : COL_msk_alt 
values      : -16, 5164  (min, max)

It is my first time with raster data and I want to extract the data by grids of 1km2 (or less). I know the resolution of both rasters can be coerced to fit into that area requirement, also both dimensions are equal, so the number of pixels per raster is the same.

My question is, can I only merge all the rasters (the ones in the stack and the elevation raster) and extract the data with the confidence that all the pixels overlap (or are in the same place)? Or do I have to create a master SpatialGrid or SpatialPixel object and then extract the raster data to these objects?

Thanks in advance,

Data from the raster stack can be downloaded by clicking at this link (if you want to download all the stack, you can use the script in https://github.com/ivanhigueram/nightlights):

http://www.ngdc.noaa.gov/eog/data/web_data/v4composites/

Elevation:

#Download country map and filter by pacific states
colombia_departments <- getData("GADM", download=T, country="CO", level=1)
pacific_littoral <- c(11, 13, 21, 30)
pacific_littoral_map <- colombia_departments[colombia_departments@data$ID_1 %in% pacific_littoral, ]

#Download elevation data and filter it for pacific states
elevation <- getData("alt", co="COL")
elevation_pacifico <- crop(elevation, pacific_littoral_map)
elevation_pacifico <- setExtent(elevation_pacifico, rasters_extent)

Solution

  • If the resolutions, extents and coordinate systems of the two raster objects are identical, then the cells will overlap perfectly. You can confirm this by looking at the coordinates:

    coordinates(stack_pacifico)
    coordinates(elevation_pacifico)
    # are they the same?
    identical(coordinates(stack_pacifico), coordinates(elevation_pacifico))
    

    You can extract all cell values for each object using one of the following:

    as.data.frame(r)
    values(r)
    r[]
    extract(r, seq_len(ncell(r)))
    

    where r is your raster object.

    (These do not all have consistent behaviour for single raster layers - as.data.frame(r) ensures the result is a data.frame, which would have a single column if r is a single raster layer; in contrast the alternatives would return a simple vector if used with a single raster layer.)

    The rows of as.data.frame(stack_pacifico) correspond to cells at the same coordinates as do the rows of as.data.frame(elevation_pacifico) (or, equivalently, the elements ofvalues(elevation_pacifico)`).