python sum nan threshold python-xarray

Sum a daily time series into a monthly time series with a NaN value threshold


I have a 3D time series data matrix from January 1st, 1979 to December 31st, 2005. The matrix is currently 9862x360x720 (daily rainfall x 0.5° latitude x 0.5° longitude). I want to sum the daily rainfall into monthly rainfall (a total of 324 months) while also setting a threshold for summing NaN values.

In other words, if there are more than 10 daily NaN values for a particular lat/lon grid cell in a given month, I want to mark the monthly summed cell as NaN. If there are fewer than 10 daily NaN values for the grid cell, I want to sum the remaining non-NaN daily values and use that as the monthly value.

I had success using the xarray library's "resample" function, but I couldn't figure out a way to set a threshold for NaN values. Everything I've read says to use sum or nansum functions, but I can't find a way to set a NaN threshold through either of those functions. I'm open to any method at this point (xarray or otherwise).

import netCDF4
import numpy as np
import xarray as xr
import pandas as pd

f = netCDF4.Dataset("daily_data", 'r')

daily_dataset = xr.Dataset({'precipitation': (['time', 'lat', 'lon'],  f['precipitation'][:, :, :])},
             coords={'lat': (f['lat'][:]), 'lon': (f['lon'][:]), 'time': pd.date_range('1979-01-01', periods=9862)})

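# Newer xarray releases take the dimension as a keyword and the reduction as a method
# (the old how=/dim= arguments were removed).
monthly_dataset = daily_dataset['precipitation'].resample(time='M').sum(skipna=False)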

I was able to sum the daily data to monthly with the above code, but I was not able to set a NaN threshold. The daily data is currently stored in a NetCDF file.


Solution

  • I believe this does what you want:

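    import math
    
    NaN = float("nan")  # Make a constant for NaN
    
    def sum_nan_threshold(iterable, *, nan_threshold=10):
        values = list(iterable)  # Materialize so the data can be scanned twice
        # NaN never compares equal to anything (not even itself), so test with math.isnan
        if sum(math.isnan(x) for x in values) > nan_threshold:  # Are there more NaNs than the threshold?
            return NaN
        else:
            return sum(x for x in values if not math.isnan(x))  # Else sum up the non-NaN values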
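
    Since the data is already in xarray, the same rule can probably be applied to the whole grid at once instead of cell by cell. The following is only a sketch, not a tested recipe for the real file: the function name `monthly_sum_with_nan_threshold` and the parameter `max_nans` are made up for illustration, the toy array stands in for the NetCDF data, and the `"MS"` frequency (labels at month start) is an assumption chosen because it works across pandas versions. It counts the missing days per month, sums the non-NaN days, and masks any monthly cell with more than `max_nans` missing days.

```python
import numpy as np
import pandas as pd
import xarray as xr


def monthly_sum_with_nan_threshold(daily, max_nans=10):
    """Sum a daily DataArray to monthly totals, masking cells with too many NaN days.

    Hypothetical helper for illustration; `daily` needs a datetime `time` dimension.
    """
    # Number of missing days per month for every grid cell.
    nan_count = daily.isnull().resample(time="MS").sum()
    # Monthly total of the non-NaN days.
    monthly = daily.resample(time="MS").sum(skipna=True)
    # Keep a total only where the NaN count does not exceed the limit.
    return monthly.where(nan_count <= max_nans)


# Toy data: January and February 1979 on a 1 x 2 grid, 1.0 of rain per day.
time = pd.date_range("1979-01-01", periods=59)
data = np.ones((59, 1, 2))
data[:11, 0, 0] = np.nan  # 11 missing days in January: over the limit
data[:3, 0, 1] = np.nan   # 3 missing days in January: under the limit

daily = xr.DataArray(
    data,
    dims=["time", "lat", "lon"],
    coords={"time": time, "lat": [0.25], "lon": [0.25, 0.75]},
)
monthly = monthly_sum_with_nan_threshold(daily)
print(monthly.values)
```

    In the toy data, the first cell's January total comes out as NaN, while the second cell's January total is the sum of its 28 remaining days. Whether exactly 10 missing days should be kept or masked is not stated in the question; the sketch keeps it, and changing `<=` to `<` masks it instead.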