I'm interested in creating two variables from a time-series of spatial raster data in R from an netcdf file. Opening the data, subsetting by Z and creating a mean value is straight forward:
# Working example below using a ~16mb nc file of sea ice from HadISST
library(R.utils)
library(raster)
library(tidyverse)
download.file("https://www.metoffice.gov.uk/hadobs/hadisst/data/HadISST_ice.nc.gz","HadISST_ice.nc.gz")
gunzip("HadISST_ice.nc.gz", ext="gz", FUN=gzfile)
hadISST <- brick('Datasets/HadISST/HadISST_ice.nc')
# subset to a decade time period and create decadal mean from monthly data
hadISST_a <- hadISST %>% subset(., which(getZ(.) >= as.Date("1900-01-01") & getZ(.) <= as.Date("1909-12-31"))) %>% mean(., na.rm = TRUE)
But, I'm interested in extracting 1) annual mean values, and 2) annual mean of three minimum monthly values for the subsetted time period. My current work flow is using nc_open() and ncvar_get() to open the data, raster::extract() to get the values and then tidverse group_by() and slice_min() to get the annual coolest months, but it's a slow and cpu intensive approach. Is there a more effective way of doing this without converting from raster - data.frame?
Questions:
Using the above code how can I extract annual means rather than a mean of ALL months over the decadal period?
Is there a way of using slice_min(order_by = sst, n = 3) or similar with brick objects to get the minimum three values per year prior to annual averaging?
Example data
if (!file.exists("HadISST_ice.nc")) {
download.file("https://www.metoffice.gov.uk/hadobs/hadisst/data/HadISST_ice.nc.gz","HadISST_ice.nc.gz")
R.utils:::gunzip("HadISST_ice.nc.gz")
}
library(terra)
hadISST <- rast('HadISST_ice.nc')
Annual mean
y <- format(time(hadISST), "%Y")
m <- tapp(hadISST, y, mean)
Mean of the lowest three monthly values by year (this takes much longer because a user-defined R function is used). I now see that there is a bug in the CRAN version. You can instead use version 1.5-47 that you can install like this install.packages('terra', repos='https://rspatial.r-universe.dev')
.
f <- function(i) mean(sort(i)[1:3])
m3 <- tapp(hadISST, y, f)
To make this faster (if you have multiple cores):
m3 <- tapp(hadISST, y, f, cores=4)