I have a dataset that is made up of timestamp readings for pressure that should be recorded and returned every 15 minutes. The example data below shows that there are gaps greater than 15 minutes in the data set
I have been trying to find a way to add in a count column that counts how many readings are missing between received ones, such that a 15-minute interval = 0 (missed readings), a half-hour gap = 1 (missed reading )and a 45-minute gap = 2, etc.
At this stage I don't have any code that I would want to show because I am still a long way off, although I have been working around this post, but still not succeeding. How Can I Detect Gaps and Consecutive Periods In A Time Series In Pandas
Any pointers would be greatly appreciated
Place date pressure (m)
Somewhere 01/09/2019 00:00 34
Somewhere 01/09/2019 00:30 34
Somewhere 01/09/2019 00:45 34
Somewhere 01/09/2019 01:15 34
Somewhere 01/09/2019 01:30 34
Somewhere 01/09/2019 02:15 34
Somewhere 01/09/2019 02:30 34
Somewhere 01/09/2019 02:45 34
Somewhere 01/09/2019 03:15 34
Somewhere 01/09/2019 03:30 34
Somewhere 01/09/2019 03:45 34.5
Somewhere 01/09/2019 04:00 34
Somewhere 01/09/2019 04:15 34
Somewhere 01/09/2019 06:45 33.5
Somewhere 01/09/2019 07:00 33.5
Somewhere 01/09/2019 07:30 34
Use, pd.to_datetime
to convert the date
column into pandas datetime series, then use Series.diff
to calculate the successive differences between the dates, then divide this differences by pd.Timedelta
with interval of 15min
, finally use .fillna
to fill the NaN
values with 0
:
df['date'] = pd.to_datetime(df['date'])
df['gap'] = (df['date'].diff() / pd.Timedelta(minutes=15)).sub(1).fillna(0)
Result:
# print(df)
Place date pressure (m) gap
0 Somewhere 2019-01-09 00:00:00 34.0 0.0
1 Somewhere 2019-01-09 00:30:00 34.0 1.0
2 Somewhere 2019-01-09 00:45:00 34.0 0.0
3 Somewhere 2019-01-09 01:15:00 34.0 1.0
4 Somewhere 2019-01-09 01:30:00 34.0 0.0
5 Somewhere 2019-01-09 02:15:00 34.0 2.0
6 Somewhere 2019-01-09 02:30:00 34.0 0.0
7 Somewhere 2019-01-09 02:45:00 34.0 0.0
8 Somewhere 2019-01-09 03:15:00 34.0 1.0
9 Somewhere 2019-01-09 03:30:00 34.0 0.0
10 Somewhere 2019-01-09 03:45:00 34.5 0.0
11 Somewhere 2019-01-09 04:00:00 34.0 0.0
12 Somewhere 2019-01-09 04:15:00 34.0 0.0
13 Somewhere 2019-01-09 06:45:00 33.5 9.0
14 Somewhere 2019-01-09 07:00:00 33.5 0.0
15 Somewhere 2019-01-09 07:30:00 34.0 1.0