Search code examples
rr-rasterchron

How to filter HadISST raster data in a range of months in R?


I'm trying to study the Sea Surface Temperature (SST) correlation with tropical-cyclone activity in a certain range of months. The data I'm using is from the Hadley Centre (in NetCDF format) using the get_anual_ssts() function from the hadsstR package.

get_annual_ssts <- function(hadsst_raster, years = 1969:2011) {
    mean_rasts <-
        apply(matrix(years), 1, function(x) {
            yearIDx <- which(chron::years(hadsst_raster@z$Date) == x)
            subset_x <- raster::subset(hadsst_raster, yearIDx)
            means <- raster::calc(subset_x, mean, na.rm = TRUE)
            names(means) <- as.character(x)
            return(means)
        })
    mean_brick <- raster::brick(mean_rasts)
    mean_brick <- raster::setZ(mean_brick, as.Date(paste0(years, '-01-01')), 'Date')
    return(mean_brick)
}

What I need is to have an additional parameter that allows me to filter by months of hurricane activity instead of calculating the whole year mean SST.

For instance, for the Southwest Pacific Ocean, I should be able to call get_annual_ssts(hadsst_raster, 12:04, 1966:2007), being December-April the months of activity of hurricane activity. Setting a range of months that comprise two different years would be crucial (maybe stating the initial month and the range length to ease the structure of mean_brick, saving the mean on the initial year?).

Looking at chron's documentation, it doesn't seem possible to assign a subset of mm-yy or something similar. What would be the best way to accomplish this?

Here's how the input raster data (hadsst_raster) looks like, for reference:

class       : RasterBrick 
dimensions  : 180, 360, 64800, 1766  (nrow, ncol, ncell, nlayers)
resolution  : 1, 1  (x, y)
extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 
data source : ~/Downloads/Hadley/HadISST_sst.nc 
names       : X1870.01.16, X1870.02.14, X1870.03.16, X1870.04.15, X1870.05.16, X1870.06.16, X1870.07.16, X1870.08.16, X1870.09.16, X1870.10.16, X1870.11.16, X1870.12.16, X1871.01.16, X1871.02.15, X1871.03.16, ... 
Date        : 1870-01-16, 2017-02-16 (min, max)
varname     : sst 

And how the output (get_annual_ssts(hadsst_raster, 1966:2007)) looks like:

class       : RasterBrick 
dimensions  : 180, 360, 64800, 42  (nrow, ncol, ncell, nlayers)
resolution  : 1, 1  (x, y)
extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 
data source : in memory
names       :      X1966,      X1967,      X1968,      X1969,      X1970,      X1971,      X1972,      X1973,      X1974,      X1975,      X1976,      X1977,      X1978,      X1979,      X1980, ... 
min values  :  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167, -1000.0000, -1000.0000, ... 
max values  :   29.94996,   29.66276,   29.70941,   30.22522,   29.61913,   29.43723,   29.65050,   29.73929,   29.59117,   29.48381,   29.36425,   29.72932,   29.70908,   29.84216,   29.84868, ... 
Date        : 1966-01-01, 2007-01-01 (min, max)

Solution

  • Ok, I've got a little something. Maybe you an use it to modify your function:

    ## Generate your layer names (used for indexing later)
    
    nms <- expand.grid(paste0('X',1969:2011),c("01","02","03","04","05","06","07","08","09","10","11","12"),'16')
    
    nms <- apply(nms,1,function(x) paste0(x,collapse = '.'))
    
    nms <- sort(nms)
    
    ## Generating fake raster brick
    
    r <- raster()
    r[] <- runif(ncell(r))
    
    rst <- lapply(1:length(nms),function(x) r)
    
    rst <- do.call(brick,rst)
    
    names(rst) <- nms
    

    And now you can index the brick with the layer names. Loop through the Hurricane Seasons (starting with Year1 -1):

    for (ix in 1970:2011){
    
    
      sel <- rst[[c(grep(paste0(ix-1,'.12'),nms),sapply(paste0(0,1:4),function(x) grep(paste0(ix,'.',x),nms)))]]
    
    
      break ## in case you don't want to go through all iterations
    
      }
    

    For the first iteration, I'm getting this output:

    > sel
    class       : RasterStack 
    dimensions  : 180, 360, 64800, 5  (nrow, ncol, ncell, nlayers)
    resolution  : 1, 1  (x, y)
    extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
    coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
    names       :  X1969.12.16,  X1970.01.16,  X1970.02.16,  X1970.03.16,  X1970.04.16 
    min values  : 5.988637e-06, 5.988637e-06, 5.988637e-06, 5.988637e-06, 5.988637e-06 
    max values  :    0.9999771,    0.9999771,    0.9999771,    0.9999771,    0.9999771 
    

    Let me know if that's helpful.


    Edit:

    So maybe a more applicable example:

    (the function assumes your layer names of your input brick x have the format Xyyyy.mm.dd)

    hadSSTmean <- function(x, years, first.range = 11:12, second.range = 1:4){
    
      nms <- names(x)
    
      mts <- c("01","02","03","04","05","06","07","08","09","10","11","12")
    
      xMeans <- vector(length = length(years)-1,mode='list')
    
      for (ix in 2:length(years){
    
        xMeans[[ix-1]] <- mean(x[[c(sapply(first.range,function(x) grep(paste0(years[ix-1],'.',mts[x]),nms)),sapply(1:4,function(x) grep(paste0(years[ix],'.',mts[x]),nms)))]])
    
      }
    
      return(do.call(brick,xMeans))
      # you could also return the list instead of a single brick
    }