python netcdf nco cdo-climate

Converting an accumulated variable to timestep values in a netCDF file with CDO


I have a netCDF file with about 100 timesteps on a grid and a single variable that is accumulated over the timesteps. I now want to calculate the contribution of each timestep to the variable's value (i.e. the difference between consecutive timesteps).

Currently I use the following sequence:

  1. extract every single timestep into a new file with cdo seltimestep,$i ...,
  2. calculate each difference into a new file with cdo sub $i ${i-1} ...,
  3. and merge those new files into one single result file with cdo mergetime ....

That seems very cumbersome to me and far from ideal in terms of performance. Because of the number of timesteps I cannot use a cdo pipeline, so I have to create many intermediate files.

Is there a better way to convert an accumulated variable to timestep values with cdo (or with something else, such as nco or ncl)?


Solution

  • numpy's diff computes the difference of consecutive entries.

    I suspect you have a multidimensional variable in your file, so here is a generic example of how to do it:

    import netCDF4
    import numpy as np
    
    ncfile = netCDF4.Dataset('./myfile.nc', 'r')
    var = ncfile.variables['variable'][:,:,:] # [time x lat x lon]
    
    # Differences between consecutive steps along the 'time' axis (0);
    # the result has one fewer time step than the input
    var_diff = np.diff(var, n=1, axis=0)
    ncfile.close()
    
    # Write out the new variable to a new file     
    ntim, nlat, nlon = np.shape(var_diff)
    
    ncfile_out = netCDF4.Dataset('./outfile.nc', 'w')
    ncfile_out.createDimension('time', ntim)
    ncfile_out.createDimension('lat', nlat)
    ncfile_out.createDimension('lon', nlon)
    var_out = ncfile_out.createVariable('variable', 'f4', ('time', 'lat', 'lon',))
    var_out[:,:,:] = var_diff[:,:,:]
    ncfile_out.close()
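
Because np.diff returns one fewer entry along the differenced axis, the contribution of the first timestep is lost in the example above. If the accumulation starts from zero (an assumption about your data, not something the file guarantees), a possible workaround is to prepend a field of zeros via the `prepend` argument of np.diff (available in NumPy 1.16 and later). The small array `acc` below is made-up illustration data, not taken from any real file:

```python
import numpy as np

# Hypothetical accumulated field: 4 time steps on a 1 x 2 grid [time x lat x lon]
acc = np.array([
    [[1.0, 2.0]],
    [[3.0, 2.5]],
    [[6.0, 4.0]],
    [[6.0, 7.0]],
])

# Plain diff along the time axis: the output has 3 time steps, not 4
per_step = np.diff(acc, n=1, axis=0)

# Assumption: accumulation starts at zero, so prepend a zero field
# to keep the first time step as its own contribution
zeros = np.zeros_like(acc[:1])
per_step_full = np.diff(acc, n=1, axis=0, prepend=zeros)

print(per_step.shape)       # shape without the first step
print(per_step_full.shape)  # same shape as the input
```

Summing `per_step_full` cumulatively along the time axis (np.cumsum with axis=0) should reproduce `acc`, which is a quick way to check the result.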