Search code examples
rtime-seriesgeospatialrasterterra

Aggregate daily raster data by weeks in R


I have daily snow and ice extent raster files for 2022 from NOAA (https://noaadata.apps.nsidc.org/NOAA/G02156/GIS/4km/2022/). I need to get the maximum ice extent per week for the year.

My task is something similar to what's described here, but I need to retain dates for use in the future.

I've tried to use tapply, but I need to retain the start and end dates for the week to rename the new weekly rasters, ideally in the format “YYYYDDDYYYYDDD_ice”. I need use these rasters as a mask for other weekly raster data, so I need to be able to match up dates.

Here's what I'm working with:

library(terra)
library(dplyr)

ICE_dat <- list.files(ICE_path,
                      full.names = TRUE,
                      pattern = ".tif$")

ICE_stack <- rast(ICE_dat)
print(ICE_stack)
#> class       : SpatRaster 
#> dimensions  : 6144, 6144, 365  (nrow, ncol, nlyr)
#> resolution  : 4000, 4000  (x, y)
#> extent      : -12288000, 12288000, -12288000, 12288000  (xmin, xmax, ymin, ymax)
#> coord. ref. : +proj=stere +lat_0=90 +lat_ts=60 +lon_0=-80 +x_0=0 +y_0=0 +a=6378137 +rf=291.505347349177 +units=m +no_defs 
#> sources     : ims2022001_4km_GIS_v1.3.tif  
#>               ims2022002_4km_GIS_v1.3.tif  
#>               ims2022003_4km_GIS_v1.3.tif  
#>               ... and 362 more source(s)
#> names       : ims20~_v1.3, ims20~_v1.3, ims20~_v1.3, ims20~_v1.3, ims20~_v1.3, ims20~_v1.3, ... 
#> min values  :          ? ,           0,           0,           0,           0,           0, ... 
#> max values  :          ? ,           4,           4,           4,           4,           4, ...

# Extract dates
n <- names(ICE_stack)
year <- as.numeric(substr(n, 4, 7))
doy  <- as.numeric(substr(n, 8, 10))
date <- as.Date(doy, origin=paste(year-1, "-12-31", sep=""))
week <- floor(doy / 7) + 1 #set my own weeks
newweek <- c(0, 1, week[-c(364,365)]) # turn Jan 1 to zero then add another day to week 1 and get rid of 364 and 365 values, which are week 53
myweek <- formatC(newweek, width=2, flag=0)
myweek
#>   [1] "00" "01" "01" "01" "01" "01" "01" "01" "02" "02" "02" "02" "02" "02" "02"

# Assign date to rasters
terra::time(ICE_stack) <- date

# Summarize by week
wk <- tapp(ICE_stack, myweek, max, na.rm=TRUE)

print(wk)
#> class       : SpatRaster 
#> dimensions  : 6144, 6144, 53  (nrow, ncol, nlyr)
#> resolution  : 4000, 4000  (x, y)
#> extent      : -12288000, 12288000, -12288000, 12288000  (xmin, xmax, ymin, ymax)
#> coord. ref. : +proj=stere +lat_0=90 +lat_ts=60 +lon_0=-80 +x_0=0 +y_0=0 +a=6378137 +rf=291.505347349177 +units=m +no_defs 
#> source      : spat_198457836cc_408.tif 
#> names       : X00, X01, X02, X03, X04, X05, ... 
#> min values  :   0,   0,   0,   0,   0,   0, ... 
#> max values  :   4,   4,   4,   4,   4,   4, ...
names(wk)
#>  [1] "X00" "X01" "X02" "X03" "X04" "X05" "X06" "X07" "X08" "X09" "X10" "X11"
#> [13] "X12" "X13" "X14" "X15" "X16" "X17" "X18" "X19" "X20" "X21" "X22" "X23"
#> [25] "X24" "X25" "X26" "X27" "X28" "X29" "X30" "X31" "X32" "X33" "X34" "X35"
#> [37] "X36" "X37" "X38" "X39" "X40" "X41" "X42" "X43" "X44" "X45" "X46" "X47"
#> [49] "X48" "X49" "X50" "X51" "X52"

I would like names(wk) to include dates. Start and end dates if possible, or end dates of the week at a minimum. Could the new "rts" or "tidyterra" package hold some solutions?


Solution

  • As far as I can judge, names(wk) simply contains the (aggregation) levels passed to the index argument of terra::tapp(), so you can't have custom interval definitions here natively, but of course you can customize the names of the resulting layers manually afterwards.

    By the way, your approach using myweek in its current state would aggregate data across years when dealing with data from multiple years, so unless this is your intention, you should probably just append the year.

    With data for Jan. 2022 only, this is what you get:

    library(terra)
    #> terra 1.7.78
    library(dplyr)
    
    fnames <- list.files(pattern = "tif$")
    
    r <- rast(fnames)
    
    # set time attribute
    year <- substr(fnames, 4, 7)
    doy <- substr(fnames, 8, 10)
    
    time(r) <- paste0(year, "-", doy) |> strptime("%Y-%j") |> as.Date()
    r
    #> class       : SpatRaster 
    #> dimensions  : 6144, 6144, 31  (nrow, ncol, nlyr)
    #> resolution  : 4000, 4000  (x, y)
    #> extent      : -12288000, 12288000, -12288000, 12288000  (xmin, xmax, ymin, ymax)
    #> coord. ref. : +proj=stere +lat_0=90 +lat_ts=60 +lon_0=-80 +x_0=0 +y_0=0 +a=6378137 +rf=291.505347349177 +units=m +no_defs 
    #> sources     : ims2022001_4km_GIS_v1.3.tif  
    #>               ims2022002_4km_GIS_v1.3.tif  
    #>               ims2022003_4km_GIS_v1.3.tif  
    #>               ... and 28 more source(s)
    #> names       : ims20~_v1.3, ims20~_v1.3, ims20~_v1.3, ims20~_v1.3, ims20~_v1.3, ims20~_v1.3, ... 
    #> time (days) : 2022-01-01 to 2022-01-31
    
    # built index
    week_levels <- time(r) |> format("%Y-%W")
    week_levels
    #>  [1] "2022-00" "2022-00" "2022-01" "2022-01" "2022-01" "2022-01" "2022-01"
    #>  [8] "2022-01" "2022-01" "2022-02" "2022-02" "2022-02" "2022-02" "2022-02"
    #> [15] "2022-02" "2022-02" "2022-03" "2022-03" "2022-03" "2022-03" "2022-03"
    #> [22] "2022-03" "2022-03" "2022-04" "2022-04" "2022-04" "2022-04" "2022-04"
    #> [29] "2022-04" "2022-04" "2022-05"
    
    # aggregate by time
    wk <- tapp(r, week_levels, max, na.rm = TRUE)
    
    # not bad, but let's tidy this a bit
    names(wk)
    #> [1] "X2022.00" "X2022.01" "X2022.02" "X2022.03" "X2022.04" "X2022.05"
    

    I'm pretty sure there is a more elegant way to "fix" the names according to your needs, but it seems like the naive for loop also gets the job done:

    # get individual indices used
    wl <- week_levels |> unique()
    
    for (i in 1:length(wl)) {
      
      # extract relevant year and doy
      ind <- week_levels == wl[i]
      
      y <- year[ind] |> max()
      
      d <- doy[ind] |> range() |> stringr::str_pad(width = 3, side = "left", pad = "0")
      
      # construct your interval definition to be used as layer name 
      rname <- paste0(y, d) |> paste0(collapse = "_")
      
      # and assign this to the layer of your aggregated raster
      names(wk[[i]]) <- rname
    }
    
    names(wk)
    #> [1] "2022001_2022002" "2022003_2022009" "2022010_2022016" "2022017_2022023"
    #> [5] "2022024_2022030" "2022031_2022031"