I have a timeseries dataframe where there are alerts for some particular rows. The dataframe looks like-
machineID time vibration alerts
1 2023-02-15 220 1
11:45
1 2023-02-15 221 0
12:00
1 2023-02-15 219 0
12:15
1 2023-02-15 220 1
12:30
1 2023-02-16 220 1
11:45
1 2023-02-16 221 1
12:00
1 2023-02-16 219 0
12:15
1 2023-02-16 220 1
12:30
I want to calculate difference of alerts
columns for each day. But since the date
column is in time interval of 15 minutes, I am not getting how to group for whole day i.e., sum the alerts for each day and compare it with the sum of all alerts of the previous day.
In short, I need a way to sum all alerts for each day and substract with previous day. The result should be in another dataframe where there is a date column and difference of alerts column. In this case, the new dataframe will be-
time diff_alerts
2023-02-16 1
since there is difference of 1 alert on the next day i.e. 16-02-2023
Group by day with a custom pd.Grouper
then sum alerts and finally compute the diff with the previous day:
>>> (df.groupby(pd.Grouper(key='time', freq='D'))['alerts'].sum().diff()
.dropna().rename('diff_alerts').astype(int).reset_index())
time diff_alerts
0 2023-02-16 1
Note: the second line of code is just here to have a clean output.