Hi guys,
I must be missing something very obvious, but
I have a datetime series at an hourly rate. I need to downsample it to a daily rate, which is pretty simple using resample('D').
But I cannot downsample it using the mean. I need to pick one hour of the day (00:00, for example) and use its value as the value for the given day.
Before:
datetime values
2018-05-08 00:00:00 0.1
2018-05-08 01:00:00 0.5
2018-05-08 02:00:00 0.7
2018-05-08 03:00:00 0.4
2018-05-08 04:00:00 0.7
Desired Output
datetime values
2018-05-08 0.1
Is there a method in resample for this, or should I use another approach?
Best
Edit
First, I have a big datetime series.
datetime values
2018-05-08 00:00:00 0.1
2018-05-08 01:00:00 0.5
2018-05-08 02:00:00 0.7
2018-05-08 03:00:00 0.4
2018-05-08 04:00:00 0.7
Then I applied a running average, maintaining the hourly rate.
df['values'] = df['values'].rolling(168, center=True).mean()
I use 168 (7 days x 24 hours) because I need 3 days before and 3 days after at an hourly rate.
From here I need to downsample, but if I use the standard resample method it will average the values again.
df = df.resample('D').mean()
You can apply whatever function you want. Some of them are just already implemented for you (like mean, sum, but also first and last):
df.resample('D').first()
#             values
# datetime
# 2018-05-08     0.1
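To check this on the sample from the question, here is a minimal self-contained sketch. The DataFrame construction is my own assumption about how the data is laid out (a DatetimeIndex named datetime and a single values column); adapt it to however your data is actually loaded.

```python
import pandas as pd

# Assumed layout: hourly DatetimeIndex named "datetime", one "values" column.
index = pd.date_range("2018-05-08", periods=5, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"values": [0.1, 0.5, 0.7, 0.4, 0.7]}, index=index)
df.index.name = "datetime"

# One row per calendar day, holding the first value found in that day.
daily = df.resample("D").first()
print(daily)
```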
But you can also apply any function you want. It will be passed the whole group to operate on, just like with groupby.
For example, this takes the last value before 2 am (assuming the dataframe is already sorted by the index):
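import datetime

def last_before_2_am(group):
    # Keep only the rows of this day stamped earlier than 02:00.
    before_2_am = group[group.index.time < datetime.time(2, 0, 0)]
    # With a sorted index, the last remaining row is the latest one before 2 am.
    return before_2_am.iloc[-1]

df.resample('D').apply(last_before_2_am)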
#             values
# datetime
# 2018-05-08     0.5
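If what you want is strictly the 00:00 reading, another option (not part of the approach above, so treat it as a sketch) is DataFrame.at_time, which selects rows by time of day. The difference matters when the midnight value can be missing, e.g. near the edges of a centered rolling mean: as far as I know, first() skips missing values by default and returns the next available hour, while at_time keeps exactly the 00:00 row. The data below is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two days of hourly values 0.0 .. 47.0.
index = pd.date_range("2018-05-08", periods=48, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"values": np.arange(48, dtype=float)}, index=index)
df.index.name = "datetime"

# Simulate a missing value at midnight of the first day.
df.iloc[0, 0] = np.nan

# first() takes the first non-missing value of each day,
# so the first day falls back to the 01:00 reading.
by_first = df.resample("D").first()

# at_time() keeps only the rows stamped exactly 00:00,
# so the missing midnight value stays missing.
at_midnight = df.at_time("00:00")

print(by_first)
print(at_midnight)
```

Which one is right depends on whether a neighbouring hour is an acceptable stand-in for a missing midnight value.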