python, python-3.x, pandas, downsampling

Python - Downsample with resample without using average/mean


Hi guys,

I must be missing something very obvious, but
I have a datetime series at an hourly rate. I need to downsample it to a daily rate, which is pretty simple using resample('D').
But I cannot downsample it using the mean. Instead, I need to choose one hour of the day (00:00, for example) and use its value as the value for the given day. Before:

datetime              values
2018-05-08 00:00:00     0.1
2018-05-08 01:00:00     0.5
2018-05-08 02:00:00     0.7
2018-05-08 03:00:00     0.4
2018-05-08 04:00:00     0.7

Desired Output

datetime              values
2018-05-08             0.1

Is there a method in resample for this, or should I use another approach?

Best

Edit

First, I have a big datetime series.

datetime              values
2018-05-08 00:00:00     0.1
2018-05-08 01:00:00     0.5
2018-05-08 02:00:00     0.7
2018-05-08 03:00:00     0.4
2018-05-08 04:00:00     0.7

Then I applied a running average, maintaining the hourly rate.

df['values'] = df['values'].rolling(168, center=True).mean()

I use 168 (7 days × 24 hours) because I need 3 days before and 3 days after at an hourly rate.
From here I need to downsample, but if I use the standard resample method it will average the values again.

df = df.resample('D').mean()

Solution

  • You can apply whatever function you want. Some are already implemented for you (mean and sum, but also first and last):

    df.resample('D').first()
    #             values
    # datetime          
    # 2018-05-08     0.1
    

    You can also apply any function of your own; it is passed the whole group to operate on, just like with groupby.

    This one, for example, takes the last value before 2 am (assuming the dataframe is already sorted by its index):

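    import datetime

    def last_before_2_am(group):
        # Keep only the rows of this day's group stamped earlier than 02:00.
        before_2_am = group[group.index.time < datetime.time(2, 0, 0)]
        # With a sorted index, the last remaining row is the latest one.
        return before_2_am.iloc[-1]

    df.resample('D').apply(last_before_2_am)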
    #             values
    # datetime          
    # 2018-05-08     0.5
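
If the goal is specifically the 00:00 value of each day, as in the question, another option may be to select those rows directly instead of resampling. The following is only a minimal sketch: the frame reuses the question's five rows, and the two rows for 2018-05-09 are made-up values added for illustration.

```python
import pandas as pd

# The question's sample rows, plus two hypothetical rows for a second day.
df = pd.DataFrame(
    {"values": [0.1, 0.5, 0.7, 0.4, 0.7, 0.9, 0.3]},
    index=pd.to_datetime([
        "2018-05-08 00:00:00",
        "2018-05-08 01:00:00",
        "2018-05-08 02:00:00",
        "2018-05-08 03:00:00",
        "2018-05-08 04:00:00",
        "2018-05-09 00:00:00",  # hypothetical
        "2018-05-09 01:00:00",  # hypothetical
    ]),
)
df.index.name = "datetime"

# Keep only the rows stamped exactly 00:00.
daily = df.at_time("00:00").copy()

# Drop the time-of-day part so the index holds one date per row.
daily.index = daily.index.normalize()

print(daily)
```

One difference worth keeping in mind: resample('D').first() falls back to the earliest available row when a day has no 00:00 entry, whereas at_time simply leaves that day out.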