Within a dataframe, I need to count and sum consecutive row values in column A
into a new column, column B
.
Starting with column A
, the script would count the consecutive runs in 1s
but when a 0
appears it prints the total count in column B
, it then resets the count and continues through the remaining data.
A | B
0 0
1 0
1 0
1 0
1 0
0 4
0 0
1 0
1 0
0 2
I've tried using .shift()
along with various if-statements
but have been unsuccessful.
This could be a way to do it. Probably there exists a more elegant solution.
df['B'] = df['A'].groupby(df['A'].ne(df['A'].shift()).cumsum()).cumsum().shift(fill_value=0) * (df['A'].diff() == -1)
This part df['A'].groupby(df['A'].ne(df['A'].shift())
groups the data by consecutive occurences of values.
Then we take the cumsum which counts the cumulated sum along each group. Then we shift the results by 1 row because you want the count after the group. Then we mask out all the rows which are not the last row of the group + 1.