python pandas datetime group-by

How to group a pandas dataframe by a defined time interval?


I have a DataFrame like the one below. I would like to group it into 60-minute intervals, with the grouping starting at 06:30.

                           data
index
2017-02-14 06:29:57    11198648
2017-02-14 06:30:01    11198650
2017-02-14 06:37:22    11198706
2017-02-14 23:11:13    11207728
2017-02-14 23:21:43    11207774
2017-02-14 23:22:36    11207776

I am using:

df.groupby(pd.TimeGrouper(freq='60Min'))

I get this grouping:

                      data
index       
2017-02-14 06:00:00     x1
2017-02-14 07:00:00     x2
2017-02-14 08:00:00     x3
2017-02-14 09:00:00     x4
2017-02-14 10:00:00     x5

but I am looking for this result:

                      data
index       
2017-02-14 06:30:00     x1
2017-02-14 07:30:00     x2
2017-02-14 08:30:00     x3
2017-02-14 09:30:00     x4
2017-02-14 10:30:00     x5

How can I tell the function to start grouping at 6:30 at one-hour intervals?

If it cannot be done with .groupby(pd.TimeGrouper(freq='60Min')), what is the best way to do it?

Regards, and thanks very much in advance.


Solution

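  • Use base=30 together with label='right' in pd.Grouper.

    base shifts the bin edges. It defaults to 0, so hourly bins start on the hour; base=30 moves the edges to half past the hour (..., 05:30, 06:30, 07:30, ...). The first row (06:29:57) then falls in the 05:30-06:30 bin, which the default label would mark as 05:30. Specifying label='right' labels each bin by its upper edge instead, so the result starts at 06:30. Note that base was deprecated in pandas 1.1 and removed in pandas 2.0; on newer versions, pass offset='30min' instead.

    Suppose you want to take the first element of every group: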

    df.groupby(pd.Grouper(freq='60Min', base=30, label='right')).first()
    # same thing using resample - df.resample('60Min', base=30, label='right').first()
    

    yields:

                               data
    index                          
    2017-02-14 06:30:00  11198648.0
    2017-02-14 07:30:00  11198650.0
    2017-02-14 08:30:00         NaN
    2017-02-14 09:30:00         NaN
    2017-02-14 10:30:00         NaN
    2017-02-14 11:30:00         NaN
    2017-02-14 12:30:00         NaN
    2017-02-14 13:30:00         NaN
    2017-02-14 14:30:00         NaN
    2017-02-14 15:30:00         NaN
    2017-02-14 16:30:00         NaN
    2017-02-14 17:30:00         NaN
    2017-02-14 18:30:00         NaN
    2017-02-14 19:30:00         NaN
    2017-02-14 20:30:00         NaN
    2017-02-14 21:30:00         NaN
    2017-02-14 22:30:00         NaN
    2017-02-14 23:30:00  11207728.0
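
For readers on pandas 1.1 or later, where base is deprecated (and removed as of 2.0), the same binning can be sketched with the offset parameter. This is a minimal sketch rather than part of the original answer: the frame is rebuilt from the six rows in the question, and the variable names (result, via_grouper, left_labelled) are only illustrative.

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame(
    {"data": [11198648, 11198650, 11198706, 11207728, 11207774, 11207776]},
    index=pd.to_datetime([
        "2017-02-14 06:29:57",
        "2017-02-14 06:30:01",
        "2017-02-14 06:37:22",
        "2017-02-14 23:11:13",
        "2017-02-14 23:21:43",
        "2017-02-14 23:22:36",
    ]),
)
df.index.name = "index"

# offset="30min" shifts the hourly bin edges to half past the hour,
# which is what base=30 did in older pandas versions.
# label="right" names each bin by its upper edge.
result = df.resample("60min", offset="30min", label="right").first()

# The same grouping expressed through groupby + pd.Grouper.
via_grouper = df.groupby(
    pd.Grouper(freq="60min", offset="30min", label="right")
).first()

# With the default label, the first bin is named by its lower edge (05:30),
# because 06:29:57 falls in the 05:30-06:30 bin.
left_labelled = df.resample("60min", offset="30min").first()

print(result)
```

The printed frame should match the output shown above: the first label is 06:30 and the last is 23:30, with NaN for the empty hours in between. Chaining .dropna() after .first() is one way to discard those empty bins if they are not wanted.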