I am looking for a way to group a pandas DataFrame by runs of consecutive equal values and apply an aggregation such as count or max to each run.
For example, if I have one column in df:
my_column
0 0
1 0
2 1
3 1
4 1
5 0
6 0
7 0
8 0
9 1
10 1
11 0
the result needs to be:
result
0 2
1 2
2 3
3 3
4 3
5 4
6 4
7 4
8 4
9 2
10 2
11 1
Why: there are two 0s at the beginning, then three 1s, and so on.
What I need is similar to this answer, but every element in a group should get the same value.
The preferred answer would show how to group consecutive equal elements and apply an aggregation function to each group, so that I could also compute, for example, the max value:
my_column other_value
0 0 7
1 0 4
2 1 1
3 1 0
4 1 5
5 0 1
6 0 1
7 0 2
8 0 8
9 1 1
10 1 0
11 0 2
and the result would be:
result
0 7
1 7
2 5
3 5
4 5
5 8
6 8
7 8
8 8
9 1
10 1
11 2
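You can build a group key that increments every time the value changes, then use groupby with transform so each row receives its group's result: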
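# A new run starts wherever a value differs from the previous row; cumsum turns those starts into run ids
g = df["my_column"].ne(df["my_column"].shift()).cumsum()
# transform broadcasts each run's count back to every row of that run
out = df.groupby(g)["my_column"].transform("count")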
Output:
print(out)
my_column
0 2
1 2
2 3
3 3
4 3
5 4
6 4
7 4
8 4
9 2
10 2
11 1
NB: to get the max, use df.groupby(g)["other_value"].transform("max").
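For reference, here is a minimal self-contained sketch of the whole approach using the sample data from the question. The variable names run_length and run_max are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "my_column":   [0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0],
    "other_value": [7, 4, 1, 0, 5, 1, 1, 2, 8, 1, 0, 2],
})

# ne(shift()) is True on the first row of each run (the first row is compared
# with NaN, so it is always True); cumsum turns those flags into run ids.
g = df["my_column"].ne(df["my_column"].shift()).cumsum()

# transform returns a result aligned with the original index,
# so every row of a run gets the same aggregated value.
run_length = df.groupby(g)["my_column"].transform("count")
run_max = df.groupby(g)["other_value"].transform("max")

print(pd.DataFrame({"run_id": g, "run_length": run_length, "run_max": run_max}))
```

Because g is an ordinary Series of run ids, the same key should work with other aggregation names accepted by transform, such as "min" or "sum".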