I have a pandas dataframe that looks something like this:
Date
31/8/20 4:56:00
31/8/20 4:45:00
31/8/20 4:39:00
31/8/20 4:36:00
31/8/20 4:34:00
31/8/20 4:05:00
31/8/20 4:04:00
31/8/20 3:19:00
31/8/20 2:54:00
I would like it to group rows on a 15-minute basis and label each row if it falls into that 15-minute datetime category. So I'd like my output to look like this:
Date Interval
31/8/20 4:56:00 Period 1
31/8/20 4:45:00 Period 1
31/8/20 4:39:00 Period 2
31/8/20 4:36:00 Period 2
31/8/20 4:34:00 Period 2
31/8/20 4:05:00 Period 4
31/8/20 4:04:00 Period 4
31/8/20 3:19:00 Period 7
31/8/20 2:54:00 Period 9
As you can see 'Period 1' contains all datetimes within the interval of 31/8/20 4:45 - 31/8/20 4:59:00. Period gaps should occur if there is no corresponding row entry for that period. This is why 'Period 4' is not 'Period 3'
I have tried the following:
df = pd.DataFrame()
df = df.sort_values(by=['Date'], ascending=False)
df['Date'] = pd.to_datetime(df['Date'])
df['Interval'] = 'Period ' + (((df.shift()['Date'] - df['Date']).dt.seconds > 900).cumsum() + 1).astype(str)
However, this returns 'Period n' within the 'Interval' column if the datetime row is within 15-minutes of the previous datetime row.
Date Interval
31/8/20 4:56:00 Period 1
31/8/20 4:45:00 Period 2
31/8/20 4:39:00 Period 3
31/8/20 4:36:00 Period 4
31/8/20 4:34:00 Period 5
31/8/20 4:05:00 Period 6
31/8/20 4:04:00 Period 7
31/8/20 3:19:00 Period 8
31/8/20 2:54:00 Period 9
Rather I would like to label each row with a 'Period' regardless of the previous rows datetime and based on four 15-minute intervals within the hour such that Periods are sequentially labelled based on
Thank you in advance.
Use DataFrame.groupby
on Grouper
with frequency of 15min
, then use groupby.ngroup
with optional parameter ascending=False
to number each group:
s = df.groupby(pd.Grouper(freq='15min', key='Date')).ngroup(ascending=False).add(1)
df['Interval'] = 'Period ' + s.astype(str)
Result:
Date Interval
0 2020-08-31 04:56:00 Period 1
1 2020-08-31 04:45:00 Period 1
2 2020-08-31 04:39:00 Period 2
3 2020-08-31 04:36:00 Period 2
4 2020-08-31 04:34:00 Period 2
5 2020-08-31 04:05:00 Period 4
6 2020-08-31 04:04:00 Period 4
7 2020-08-31 03:19:00 Period 7
8 2020-08-31 02:54:00 Period 9