Here are two dataframes grouped how I want them:
last5s = pd.Timestamp.now().replace(microsecond=0) - pd.Timedelta('5s')
dates = pd.date_range(last5s, periods = 5, freq='s')
N=10
data1 = np.random.randint(0,10,N)
data2 = np.random.randint(0,10,N)
df1 = pd.DataFrame({'timestamp': np.random.choice(dates, size=N), 'A': data1})
df2 = pd.DataFrame({'timestamp': np.random.choice(dates, size=N), 'B': data2})
print(df1)
print(df2)
print()
g1 = df1.groupby(pd.Grouper(key='timestamp', freq='1s'))
print("g1:")
for time, group in g1:
print('time:', time)
print(group)
print()
print()
g2 = df2.groupby(pd.Grouper(key='timestamp', freq='1s'))
print('g2:')
for time, group in g2:
print('time:', time)
print(group)
print()
Output (e.g.):
timestamp A
0 2024-03-01 10:05:26 7
1 2024-03-01 10:05:25 8
2 2024-03-01 10:05:28 1
3 2024-03-01 10:05:24 2
4 2024-03-01 10:05:28 5
5 2024-03-01 10:05:27 4
6 2024-03-01 10:05:24 6
7 2024-03-01 10:05:26 3
8 2024-03-01 10:05:26 8
9 2024-03-01 10:05:28 8
timestamp B
0 2024-03-01 10:05:25 1
1 2024-03-01 10:05:26 6
2 2024-03-01 10:05:25 5
3 2024-03-01 10:05:28 7
4 2024-03-01 10:05:27 7
5 2024-03-01 10:05:28 1
6 2024-03-01 10:05:28 4
7 2024-03-01 10:05:25 0
8 2024-03-01 10:05:24 6
9 2024-03-01 10:05:24 5
g1:
time: 2024-03-01 10:05:24
timestamp A
3 2024-03-01 10:05:24 2
6 2024-03-01 10:05:24 6
time: 2024-03-01 10:05:25
timestamp A
1 2024-03-01 10:05:25 8
time: 2024-03-01 10:05:26
timestamp A
0 2024-03-01 10:05:26 7
7 2024-03-01 10:05:26 3
8 2024-03-01 10:05:26 8
time: 2024-03-01 10:05:27
timestamp A
5 2024-03-01 10:05:27 4
time: 2024-03-01 10:05:28
timestamp A
2 2024-03-01 10:05:28 1
4 2024-03-01 10:05:28 5
9 2024-03-01 10:05:28 8
g2:
time: 2024-03-01 10:05:24
timestamp B
8 2024-03-01 10:05:24 6
9 2024-03-01 10:05:24 5
time: 2024-03-01 10:05:25
timestamp B
0 2024-03-01 10:05:25 1
2 2024-03-01 10:05:25 5
7 2024-03-01 10:05:25 0
time: 2024-03-01 10:05:26
timestamp B
1 2024-03-01 10:05:26 6
time: 2024-03-01 10:05:27
timestamp B
4 2024-03-01 10:05:27 7
time: 2024-03-01 10:05:28
timestamp B
3 2024-03-01 10:05:28 7
5 2024-03-01 10:05:28 1
6 2024-03-01 10:05:28 4
How do I "join" the groups together such that I can iterate over them together? E.g. I want to be able to do:
for time, group1, group2 in somehow_joined(g1,g2):
<do stuff with group1 and group2 in this common time group>
You can just do:
for t, d1 in g1:
d2 = g2.get_group(t)
if d2 is None:
print("I don't want this")
continue
print(d1)
print('-'*10)
print(d2)
print('='*30)
Output:
timestamp A
0 2024-02-29 19:10:14 0
3 2024-02-29 19:10:14 7
7 2024-02-29 19:10:14 1
----------
timestamp B
1 2024-02-29 19:10:14 0
3 2024-02-29 19:10:14 6
5 2024-02-29 19:10:14 9
6 2024-02-29 19:10:14 4
==============================
timestamp A
2 2024-02-29 19:10:15 2
5 2024-02-29 19:10:15 8
6 2024-02-29 19:10:15 2
9 2024-02-29 19:10:15 6
----------
timestamp B
8 2024-02-29 19:10:15 9
==============================
timestamp A
1 2024-02-29 19:10:16 3
4 2024-02-29 19:10:16 9
8 2024-02-29 19:10:16 6
----------
timestamp B
2 2024-02-29 19:10:16 6
4 2024-02-29 19:10:16 6
==============================