Search code examples
pythonnumpynetcdfweathernoaa

Calculating 30 year climate normal from gridded dataset in Python


I am trying to calculate the 30 year temperature normal (1981-2010 average) for the NARR daily gridded data set linked below.

In the end for each grid point I want an array that contains 365 values, each of which contains the average temperature of that day calculated from the 30 years of data for that day. For example the first value in each grid point's array would be the average Jan 1 temperature calculated from the 30 years (1981-2010) of Jan 1 temperature data for that grid point. My end goal is to be able to use this new 30yrNormal array to calculate daily temperature anomalies from.

So far I have only been able to calculate anomalies from one year worth of data. The problem with this is that it is taking the difference between the daily temperature and the average for the whole year rather then the difference between the daily temperature and the 30 year average of that daily temperature:

file='air.sfc.2018.nc'
ncin = Dataset(file,'r')
#put data into numpy arrays
lons=ncin.variables['lon'][:]
lats=ncin.variables['lat'][:]
lats1=ncin.variables['lat'][:,0]
temp=ncin.variables['air'][:]
ncin.close()

AvgT=np.mean(temp[:,:,:],axis=0)
#compute anomalies by removing time-mean
T_anom=temp-AvgT

Data: ftp://ftp.cdc.noaa.gov/Datasets/NARR/Dailies/monolevel/ For the years 1981-2010


Solution

  • This is most easily solved using CDO.

    You can use my package, nctoolkit (https://nctoolkit.readthedocs.io/en/latest/ & https://pypi.org/project/nctoolkit/) if you are working with Python on Linux. This uses CDO as a backend.

    Assuming that the 30 files are a list called ff_list. The code below should work.

    First you would create the 30 year daily mean climatology.

    import nctoolkit as nc
    mean_30 = nc.open_data(ff_list)
    mean_30.merge_time()
    mean_30.drop(month=2,day=29)
    mean_30.tmean("day")
    mean_30.run()
    

    Then you would subtract this from the daily figures to get the anomalies.

    anom_30 = nc.open_data(ff_list)
    anom_30.cdo_command("del29feb")
    anom_30.subtract(mean_30)
    anom_30.run()
    

    This should have the anomalies

    One issue is whether the files have leap years or how you want to handle leap years if they exists. CDO has an undocumented command -delfeb29, which I have used above