python pandas datetime pandas-resample

Prevent pandas resample from filling in gap day while downsampling


I have a pandas Series with measurements at a 1 minute interval. I want to downsample this data to a 5 minute interval. series contains measurements from the end of October 18th, none from October 19th, and then measurements from the start of October 20th. Using series.resample("5T").mean() fills October 19th with NaNs, and series.resample("5T").sum() fills the missing day with 0s:

import pandas as pd

index1 = pd.date_range("2023-10-18 23:50", "2023-10-18 23:59", freq="T")
index2 = pd.date_range("2023-10-20 00:00", "2023-10-20 00:10", freq="T")
series1 = pd.Series(range(len(index1)), index=index1)
series2 = pd.Series(range(100, len(index2)+100), index=index2)
series = pd.concat([series1, series2])

series.resample("5T").mean()

Out:

2023-10-18 23:50:00      2.0
2023-10-18 23:55:00      7.0
2023-10-19 00:00:00      NaN
2023-10-19 00:05:00      NaN
2023-10-19 00:10:00      NaN
                       ...  
2023-10-19 23:50:00      NaN
2023-10-19 23:55:00      NaN
2023-10-20 00:00:00    102.0
2023-10-20 00:05:00    107.0
2023-10-20 00:10:00    110.0
Freq: 5T, Length: 293, dtype: float64

I need pd.Series.resample to stick to the days that are in series and not fill in anything for the missing day. How can this be done?


Solution

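  • You could consider grouping by the date first, then resampling. Each group is resampled over its own time range only, so no bins are created for days that have no data.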

    series.groupby(series.index.date).resample("5T").mean()
    
    2023-10-18  2023-10-18 23:50:00      2.0
                2023-10-18 23:55:00      7.0
    2023-10-20  2023-10-20 00:00:00    102.0
                2023-10-20 00:05:00    107.0
                2023-10-20 00:10:00    110.0
    dtype: float64
    

    Add .droplevel(0) if you don't want the date level in the output index.
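
Here is a rough, self-contained sketch of the approach above, including the .droplevel(0) step. It rebuilds the sample data from the question. The name `result` is only illustrative, and "min" is used in place of the "T" alias because newer pandas releases deprecate "T".

```python
import pandas as pd

# Rebuild the sample data from the question: 1-minute measurements
# on October 18th and October 20th, nothing on October 19th.
index1 = pd.date_range("2023-10-18 23:50", "2023-10-18 23:59", freq="min")
index2 = pd.date_range("2023-10-20 00:00", "2023-10-20 00:10", freq="min")
series = pd.concat([
    pd.Series(range(len(index1)), index=index1),
    pd.Series(range(100, len(index2) + 100), index=index2),
])

# Group by calendar date, resample each day on its own,
# then drop the date level that groupby adds to the index.
result = (
    series.groupby(series.index.date)
    .resample("5min")
    .mean()
    .droplevel(0)
)

print(result)
```

Gaps inside a single day would still come back as NaN, since each day's group is resampled across its own full range from first to last timestamp.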