I have a df of indices corresponding to special events that occur in each trial in a study. The indices are really pairs (2 per trial) that specify the beginning and end of the special events.
event_df.head(16)
subject trial indices
0 1 1 7
1 1 1 14
2 1 2 28
3 1 2 33
4 2 1 40
5 2 1 48
6 2 2 62
7 2 2 67
8 3 1 80
9 3 1 86
10 3 2 93
11 3 2 99
12 4 1 113
13 4 1 120
14 4 2 130
15 4 2 139
I have another df with spatial coordinates of the participants over the entire experiment, as such -
player_df.head(72)
subject trial x_coords y_coords t_coords
0 1 1 2 1 100
1 1 1 2 2 198
2 1 1 3 1 284
3 1 1 4 2 376
4 1 1 5 3 469
5 1 1 5 3 560
6 1 1 4 2 651
7 1 1 6 1 748
8 1 1 5 2 851
9 1 1 6 2 949
10 1 1 7 3 1056
11 1 1 8 4 1160
12 1 1 2 1 102
13 1 1 2 2 201
14 1 1 3 1 300
15 1 1 4 2 395
16 1 1 5 3 489
17 1 1 5 3 591
18 1 2 4 2 684
19 1 2 6 1 778
20 1 2 5 2 871
21 1 2 6 2 977
22 1 2 7 3 1070
23 1 2 8 4 1168
24 1 2 2 1 101
25 1 2 2 2 203
26 1 2 3 1 305
27 1 2 4 2 404
28 1 2 5 3 499
29 1 2 5 3 596
30 1 2 4 2 692
31 1 2 6 1 798
32 1 2 5 2 893
33 1 2 6 2 995
34 1 2 7 3 1089
35 1 2 8 4 1194
36 2 1 2 1 100
37 2 1 2 2 198
38 2 1 3 1 284
39 2 1 4 2 376
40 2 1 5 3 469
41 2 1 5 3 560
42 2 1 4 2 651
43 2 1 6 1 748
44 2 1 5 2 851
45 2 1 6 2 949
46 2 1 7 3 1056
47 2 1 8 4 1160
48 2 1 2 1 102
49 2 1 2 2 201
50 2 1 3 1 300
51 2 1 4 2 395
52 2 1 5 3 489
53 2 1 5 3 591
54 2 2 4 2 684
55 2 2 6 1 778
56 2 2 5 2 871
57 2 2 6 2 977
58 2 2 7 3 1070
59 2 2 8 4 1168
60 2 2 2 1 101
61 2 2 2 2 203
62 2 2 3 1 305
63 2 2 4 2 404
64 2 2 5 3 499
65 2 2 5 3 596
66 2 2 4 2 692
67 2 2 6 1 798
68 2 2 5 2 893
69 2 2 6 2 995
70 2 2 7 3 1089
71 2 2 8 4 1194
72 3 1 2 1 100
73 3 1 2 2 198
74 3 1 3 1 284
75 3 1 4 2 376
76 3 1 5 3 469
77 3 1 5 3 560
78 3 1 4 2 651
79 3 1 6 1 748
80 3 1 5 2 851
81 3 1 6 2 949
82 3 1 7 3 1056
83 3 1 8 4 1160
84 3 1 2 1 102
85 3 1 2 2 201
86 3 1 3 1 300
87 3 1 4 2 395
88 3 1 5 3 489
89 3 1 5 3 591
90 3 2 4 2 684
91 3 2 6 1 778
92 3 2 5 2 871
93 3 2 6 2 977
94 3 2 7 3 1070
95 3 2 8 4 1168
96 3 2 2 1 101
97 3 2 2 2 203
98 3 2 3 1 305
99 3 2 4 2 404
100 3 2 5 3 499
101 3 2 5 3 596
102 3 2 4 2 692
103 3 2 6 1 798
104 3 2 5 2 893
105 3 2 6 2 995
106 3 2 7 3 1089
107 3 2 8 4 1194
108 4 1 2 1 100
109 4 1 2 2 198
110 4 1 3 1 284
111 4 1 4 2 376
112 4 1 5 3 469
113 4 1 5 3 560
114 4 1 4 2 651
115 4 1 6 1 748
116 4 1 5 2 851
117 4 1 6 2 949
118 4 1 7 3 1056
119 4 1 8 4 1160
120 4 1 2 1 102
121 4 1 2 2 201
122 4 1 3 1 300
123 4 1 4 2 395
124 4 1 5 3 489
125 4 1 5 3 591
126 4 2 4 2 684
127 4 2 6 1 778
128 4 2 5 2 871
129 4 2 6 2 977
130 4 2 7 3 1070
131 4 2 8 4 1168
132 4 2 2 1 101
133 4 2 2 2 203
134 4 2 3 1 305
135 4 2 4 2 404
136 4 2 5 3 499
137 4 2 5 3 596
138 4 2 4 2 692
139 4 2 6 1 798
140 4 2 5 2 893
141 4 2 6 2 995
142 4 2 7 3 1089
143 4 2 8 4 1194
I would like to use the indices column in the first df (event_df) and extract all of the rows in between (and including) those indices in another dataframe, player_df on a trial by trial basis.
The expected output would be all of the rows between the sequential pairs (per trial only). Note that I don't want rows between all sequential pairs - only those corresponding to the same trial number.
filtered_df
subject trial x_coords y_coords t_coords
7 1 1 6 1 748
8 1 1 5 2 851
9 1 1 6 2 949
10 1 1 7 3 1056
11 1 1 8 4 1160
12 1 1 2 1 102
13 1 1 2 2 201
14 1 1 3 1 300
28 1 2 5 3 499
29 1 2 5 3 596
30 1 2 4 2 692
31 1 2 6 1 798
32 1 2 5 2 893
33 1 2 6 2 995
40 2 1 5 3 469
41 2 1 5 3 560
42 2 1 4 2 651
43 2 1 6 1 748
44 2 1 5 2 851
45 2 1 6 2 949
46 2 1 7 3 1056
47 2 1 8 4 1160
48 2 1 2 1 102
62 2 2 3 1 305
63 2 2 4 2 404
64 2 2 5 3 499
65 2 2 5 3 596
66 2 2 4 2 692
67 2 2 6 1 798
80 3 1 5 2 851
81 3 1 6 2 949
82 3 1 7 3 1056
83 3 1 8 4 1160
84 3 1 2 1 102
85 3 1 2 2 201
86 3 1 3 1 300
93 3 2 6 2 977
94 3 2 7 3 1070
95 3 2 8 4 1168
96 3 2 2 1 101
97 3 2 2 2 203
98 3 2 3 1 305
99 3 2 4 2 404
113 4 1 5 3 560
114 4 1 4 2 651
115 4 1 6 1 748
116 4 1 5 2 851
117 4 1 6 2 949
118 4 1 7 3 1056
119 4 1 8 4 1160
120 4 1 2 1 102
130 4 2 7 3 1070
131 4 2 8 4 1168
132 4 2 2 1 101
133 4 2 2 2 203
134 4 2 3 1 305
135 4 2 4 2 404
136 4 2 5 3 499
137 4 2 5 3 596
138 4 2 4 2 692
139 4 2 6 1 798
I've tried filtering but am not sure how to do this taking into account the trial by trial component.
If, as you say, event_df
indices are in pairs of start and end values, you can create a list of those pairs and then filter the second dataframe using that list:
(I am only using the first two trials for subject 1)
pairs = [tuple(event_df.indices[x:x+2]) for x in range(0, len(event_df.indices), 2)]
print(pairs)
[(7, 14), (28, 33)]
res = pd.concat([player_df.iloc[start:end] for start, end in pairs])
print(res)
# subject trial x_coords y_coords t_coords
# 7 1 1 6 1 748
# 8 1 1 5 2 851
# 9 1 1 6 2 949
# 10 1 1 7 3 1056
# 11 1 1 8 4 1160
# 12 1 1 2 1 102
# 13 1 1 2 2 201
# 14 1 1 3 1 300
# 28 1 2 5 3 499
# 29 1 2 5 3 596
# 30 1 2 4 2 692
# 31 1 2 6 1 798
# 32 1 2 5 2 893
# 33 1 2 6 2 995