Tags: python, pandas, pandas-groupby

Groupby specific hour interval per day


I have a data frame that looks something like:

df = 

                     val1     val2
date
----------------------------------
2021-06-01 00:00     1        3
2021-06-01 01:00     2        4
2021-06-01 02:00     3        5
2021-06-01 03:00     4        6
2021-06-01 04:00     5        7
2021-06-02 00:00     6        8
2021-06-02 01:00     7        9
2021-06-02 02:00     8        10
2021-06-02 03:00     9        11
2021-06-02 04:00     10       12
2021-06-03 00:00     11       13
2021-06-03 01:00     12       14
2021-06-03 02:00     13       15
2021-06-03 03:00     14       16
2021-06-03 04:00     15       17
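
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample frame. The question does not show how df was created, so the construction is an assumption: the timestamps are taken to be a DatetimeIndex named date, with val1 and val2 as integer columns.

```python
import pandas as pd

# Assumed construction: hours 00:00-04:00 on three consecutive days,
# stored as a DatetimeIndex named "date".
index = pd.DatetimeIndex(
    [f"2021-06-{d:02d} {h:02d}:00" for d in (1, 2, 3) for h in range(5)],
    name="date",
)

# val1 runs 1..15 and val2 runs 3..17, matching the table above.
df = pd.DataFrame({"val1": range(1, 16), "val2": range(3, 18)}, index=index)
print(df)
```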

If I then do something like:

df_new = df.groupby(df.index.date).mean()

The resulting data frame will be:

df_new = 

                     val1     val2
date
----------------------------------
2021-06-01           3        5
2021-06-02           8        10
2021-06-03           13       15

However, I would like the mean to be computed only over a specific time interval of each day. In this case, say only the rows from 00:00 to 03:00, excluding the 04:00 row. The resulting data frame should then be:

df_true = 

                     val1     val2
date
----------------------------------
2021-06-01           2.5      4.5
2021-06-02           7.5      9.5
2021-06-03           12.5     14.5

Solution

  • One option is to filter the rows with boolean indexing first, then aggregate per day with either pd.Grouper in groupby or DataFrame.resample:

    import pandas as pd

    # Keep hours 0-3, then average per day. The two lines are alternatives.
    df_new = df[df.index.hour < 4].groupby(pd.Grouper(freq='D')).mean()
    df_new = df[df.index.hour < 4].resample('D').mean()
    print(df_new)
                val1  val2
    date                  
    2021-06-01   2.5   4.5
    2021-06-02   7.5   9.5
    2021-06-03  12.5  14.5
    

    Alternatively, use DataFrame.between_time, which includes both endpoints by default, so the 03:00 rows are kept:

    # Select 00:00-03:00 inclusive, then average per day (either line works).
    df_new = df.between_time('0:00', '3:00').groupby(pd.Grouper(freq='D')).mean()
    df_new = df.between_time('0:00', '3:00').resample('D').mean()
    print(df_new)
                val1  val2
    date                  
    2021-06-01   2.5   4.5
    2021-06-02   7.5   9.5
    2021-06-03  12.5  14.5
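
To check that the two filters agree, here is a self-contained sketch. The way the sample frame is built is an assumption (the question only shows its printed form), and the names by_mask and by_time are made up for illustration.

```python
import pandas as pd

# Assumed reconstruction of the question's frame: five hourly rows per day.
index = pd.DatetimeIndex(
    [f"2021-06-{d:02d} {h:02d}:00" for d in (1, 2, 3) for h in range(5)],
    name="date",
)
df = pd.DataFrame({"val1": range(1, 16), "val2": range(3, 18)}, index=index)

# Filter 1: a boolean mask on the hour keeps 00:00 through 03:00.
by_mask = df[df.index.hour < 4].resample("D").mean()

# Filter 2: between_time keeps both endpoints by default, so 03:00 stays in.
by_time = df.between_time("0:00", "3:00").resample("D").mean()

# Both routes select the same rows, so the daily means are identical.
assert by_mask.equals(by_time)
print(by_mask)
```

One thing to keep in mind: resample('D') returns a row for every calendar day between the first and last timestamp, so a day with no rows left after filtering shows up as NaN. If that is unwanted, appending .dropna(how='all') is one possible way to remove such rows.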