
Merge hourly ERA5 NetCDF files into daily files with R


I have hourly maximum-temperature data for 1970-2022, for all months, in NetCDF format (the NetCDF file for each month contains all 24 hours of every day from 1970-2022). Can anyone help me aggregate the hourly data to daily data in R? I first tried converting the hourly data to daily on the server and downloading the result, but a dataset that large could not be downloaded. Here is the code I tried:

wf_set_key(service = "cds") 
data=c.retrieve(
'reanalysis-era5-single-levels',
{
'product_type': 'reanalysis',
'variable': 'maximum_2m_temperature_since_previous_post_processing',
'year': [
  '1970', '1971', '1972',
  '1973', '1974', '1975',
  '1976', '1977', '1978',
  '1979', '1980', 
],
'month': [
  '03','04',
  '05', '06',
],
'day': [
  '01', '02', '03',
  '04', '05', '06',
  '07', '08', '09',
  '10', '11', '12',
  '13', '14', '15',
  '16', '17', '18',
  '19', '20', '21',
  '22', '23', '24',
  '25', '26', '27',
  '28', '29', '30','31',
],
'time': [
  '00:00', '01:00', '02:00',
  '03:00', '04:00', '05:00',
  '06:00', '07:00', '08:00',
  '09:00', '10:00', '11:00',
  '12:00', '13:00', '14:00',
  '15:00', '16:00', '17:00',
  '18:00', '19:00', '20:00',
  '21:00', '22:00', '23:00',
],
'area': [
  38, 67, 6,
  99
],
'format': 'netcdf',
 },
 'day_mean'=ct.climate.daily_mean(data,keep_attrs=True)
 if count == 1:
 day_mean_all=day_mean
 else:       
  day_mean_all=ct.cube.concat([day_mean_all, day_mean], dim='time')
  count = count + 1
   return day_mean_all
   'download.nc')

I am now trying to aggregate each month's hourly data to daily values in R:

library(ncdf4) 
ncpath <- "D:/MAX_TEMP/" 
ncname <- "adaptor.mars.internal-1681202164.1038315-25242-15-2a718a58-dcd5-4470-9fd2-ddbdede30875_march"   
ncfname <- paste(ncpath, ncname, ".nc", sep="") 
ncin <- nc_open(ncfname) 
print(ncin) 
library(dplyr) 
a1<-ncname %>%
  group_by(time) %>%
    summarize(Mean_Max_Temp = mean(expver))
#Error in UseMethod("group_by")

Solution

  • Please see the nice explanation given by Robert Hijmans here.

    I suppose you are familiar with the raster::brick() function for reading your NetCDF file (you could use the terra package as well, but let's stick with raster::brick() for now). You will then need to group the layers by day (day-month of each year) in order to aggregate over the time dimension, which is hourly in your NetCDF file. Note that the group_by() error in your attempt comes from passing ncname, which is a character string, rather than a data frame.

    Finally, use the stackApply() function as shown in the link above. Hope this helps!
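To make the steps above concrete, here is a minimal sketch of the brick + stackApply() approach. It is only an illustration under stated assumptions: it builds a small synthetic brick (48 hourly layers on a 3x3 grid) instead of reading your file, so the grid, dates, and values are made up. With a real ERA5 file you would read it with brick() and take the time stamps from getZ(); the variable name shown in the comment is an assumption and should be checked against print(ncin).

```r
library(raster)

# Synthetic stand-in for one hourly file: 48 layers = 2 days x 24 hours.
# With a real file you would instead do something like
#   b <- brick("your_file.nc", varname = "mx2t")  # varname is an assumption
# and skip the setZ() step, since brick() normally reads the time axis.
hours <- seq(as.POSIXct("1970-03-01 00:00", tz = "UTC"),
             by = "hour", length.out = 48)

# Layer h holds the constant value h, so the daily means are easy to check.
b <- brick(array(rep(1:48, each = 9), dim = c(3, 3, 48)))
b <- setZ(b, hours)

# One label per layer: the calendar day (year-month-day) it belongs to.
days <- format(getZ(b), "%Y-%m-%d", tz = "UTC")

# Turn the labels into integer group indices 1, 2, ... in date order.
day_index <- as.integer(factor(days))

# Aggregate all layers that share an index; one output layer per day.
daily <- stackApply(b, indices = day_index, fun = mean)
daily <- setZ(daily, as.Date(unique(days)))

print(daily)
# For this synthetic data: day 1 is mean(1:24) = 12.5, day 2 is mean(25:48) = 36.5
```

If you want the daily maximum rather than the mean of the hourly maxima, swapping fun = mean for fun = max should be enough. The result can then be written out with writeRaster(). In terra, tapp() plays a similar role to stackApply().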