I have a DataFrame that looks like this:
df
Pression
13/01/2022 09:01:02 3500
13/01/2022 09:01:13 3650
13/01/2022 09:01:24 4248
13/01/2022 09:01:35 4259
13/01/2022 09:01:46 4270
13/01/2022 11:43:28 345
13/01/2022 11:43:39 478
13/01/2022 11:43:50 589
15/01/2022 08:31:14 2048
15/01/2022 08:31:25 2574
15/01/2022 08:31:36 3659
15/01/2022 08:31:47 3784
15/01/2022 08:31:58 3968
15/01/2022 08:32:09 4009
I want to split it into sub-DataFrames based on the time step: whenever the gap between two consecutive rows is greater than 11 seconds,
a new sub-DataFrame should start:
EXPECTED OUTPUT:
df_1
Pression
13/01/2022 09:01:02 3500
13/01/2022 09:01:13 3650
13/01/2022 09:01:24 4248
13/01/2022 09:01:35 4259
13/01/2022 09:01:46 4270
df_2
Pression
13/01/2022 11:43:28 345
13/01/2022 11:43:39 478
13/01/2022 11:43:50 589
df_3
Pression
15/01/2022 08:31:14 2048
15/01/2022 08:31:25 2574
15/01/2022 08:31:36 3659
15/01/2022 08:31:47 3784
15/01/2022 08:31:58 3968
15/01/2022 08:32:09 4009
How can I do this without a loop? I'm already working with a list of DataFrames, so looping could be computationally expensive.
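Create a dictionary of DataFrames, keyed by group number:
import pandas as pd
#the dates are day-first (dd/mm/yyyy), so parse them that way explicitly
df.index = pd.to_datetime(df.index, dayfirst=True)
#convert the DatetimeIndex to a Series so consecutive differences can be computed
s = df.index.to_series()
#a gap greater than 11 seconds starts a new group; the cumulative sum numbers the groups
dfs = dict(tuple(df.groupby(s.diff().gt(pd.Timedelta(seconds=11)).cumsum())))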
print(dfs[0])
Pression
2022-01-13 09:01:02 3500
2022-01-13 09:01:13 3650
2022-01-13 09:01:24 4248
2022-01-13 09:01:35 4259
2022-01-13 09:01:46 4270
print(dfs[1])
Pression
2022-01-13 11:43:28 345
2022-01-13 11:43:39 478
2022-01-13 11:43:50 589
print(dfs[2])
Pression
2022-01-15 08:31:14 2048
2022-01-15 08:31:25 2574
2022-01-15 08:31:36 3659
2022-01-15 08:31:47 3784
2022-01-15 08:31:58 3968
2022-01-15 08:32:09 4009
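To see why the cumulative sum produces the groups, it may help to look at the intermediate steps one at a time. The following is a minimal, self-contained sketch that rebuilds the sample data from the question; the names `gaps`, `starts_new_block` and `group_id` are introduced here for illustration only, and the explicit `format` string assumes every timestamp follows the dd/mm/yyyy HH:MM:SS layout shown above.

```python
import pandas as pd

# Sample data copied from the question; timestamps are day-first strings.
index = [
    "13/01/2022 09:01:02", "13/01/2022 09:01:13", "13/01/2022 09:01:24",
    "13/01/2022 09:01:35", "13/01/2022 09:01:46",
    "13/01/2022 11:43:28", "13/01/2022 11:43:39", "13/01/2022 11:43:50",
    "15/01/2022 08:31:14", "15/01/2022 08:31:25", "15/01/2022 08:31:36",
    "15/01/2022 08:31:47", "15/01/2022 08:31:58", "15/01/2022 08:32:09",
]
pression = [3500, 3650, 4248, 4259, 4270, 345, 478, 589,
            2048, 2574, 3659, 3784, 3968, 4009]
df = pd.DataFrame({"Pression": pression}, index=index)
df.index = pd.to_datetime(df.index, format="%d/%m/%Y %H:%M:%S")

# Gap between each timestamp and the previous one (NaT for the first row).
gaps = df.index.to_series().diff()

# True where the gap exceeds 11 seconds, i.e. where a new block starts.
# Comparing NaT gives False, so the first row stays in group 0.
starts_new_block = gaps > pd.Timedelta(seconds=11)

# Running count of block starts: every row gets its block number.
group_id = starts_new_block.cumsum()

# One sub-DataFrame per block number.
dfs = {int(key): sub for key, sub in df.groupby(group_id)}

print(group_id.tolist())
# [0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 2]
print({key: len(sub) for key, sub in dfs.items()})
# {0: 5, 1: 3, 2: 6}
```

If only per-block statistics are needed, something like `df.groupby(group_id)["Pression"].mean()` would probably avoid materializing the sub-DataFrames at all.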