I have a df that looks like this: df here
The column Count contains either zeros or stretches of a higher number, with the stretches separated by zeros, e.g. 0,0,0,0,4,4,4,4,4,4,0,0,0,0,0,0,6,6,6,6,6,6,6,0,0,0,0.
I would like to sum the values of the column "angle_1frame_abs" for each stretch and add the result as a new column to the df, e.g.:
Count: 0,0,0,0,4,4,4,4,4,4,...
angle_1frame_abs: 2,1,3,4,2,2,2,3,4,3,...
new column: 10,10,10,10,16,16,16,16,16,16,...
Thank you all!
It seems that what you need is to divide your df into segments with the same consecutive value of Count, sum angle_1frame_abs within each segment, and copy each segment's sum to every row of that segment.
In your case we cannot use Count itself as the key for groupby, because the same value (e.g. 0) occurs in several separate stretches, so we need a new key:
new_key = (df['Count'].diff().abs() > 0).cumsum()
Using the new key, we can do the sum:
the_sum = df.groupby( new_key )['angle_1frame_abs'].sum()
Then copy the summation results back to the table using the new key:
df['NewColumn'] = new_key.map(the_sum)
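Putting the three steps together on the short example from the question may make this easier to check. This is a minimal sketch: the ten values are the ones shown in the question, and NewColumn is just the column name chosen above.

```python
import pandas as pd

# The first ten rows from the question's example
df = pd.DataFrame({
    'Count':            [0, 0, 0, 0, 4, 4, 4, 4, 4, 4],
    'angle_1frame_abs': [2, 1, 3, 4, 2, 2, 2, 3, 4, 3],
})

# Segment id: increases by 1 every time Count changes
new_key = (df['Count'].diff().abs() > 0).cumsum()

# One sum per segment, indexed by segment id
the_sum = df.groupby(new_key)['angle_1frame_abs'].sum()

# Broadcast each segment's sum back to its rows
df['NewColumn'] = new_key.map(the_sum)

print(df['NewColumn'].tolist())  # [10, 10, 10, 10, 16, 16, 16, 16, 16, 16]
```

The result matches the "new column" values requested in the question.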
Note that we use diff() to find the boundaries of the segments, then abs() > 0 to convert the boundary values to True/False (or 1/0), and finally cumsum() to get the new key, leveraging the fact that the cumulative sum only changes at a boundary and so stays constant within a segment.
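To see how the key is built, the intermediate results can be inspected one step at a time. The sketch below uses a tiny made-up series (illustrative values, not taken from the question's data), and the names step, boundary and key are only for illustration.

```python
import pandas as pd

count = pd.Series([0, 0, 4, 4, 0, 0, 6])

step = count.diff()          # NaN, 0, 4, 0, -4, 0, 6
boundary = step.abs() > 0    # True on the first row of each new segment
key = boundary.cumsum()      # running count of boundaries seen so far

print(boundary.tolist())  # [False, False, True, False, True, False, True]
print(key.tolist())       # [0, 0, 1, 1, 2, 2, 3]
```

The first diff() value is NaN, and NaN > 0 evaluates to False, so the first segment simply gets key 0. The two stretches of zeros end up with different keys (0 and 2), which is exactly what grouping by Count alone could not provide.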
I tested my code using the following fake data:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Count': [2,0,0,0,4,4,4,4,4,4,0,0,0,0,0,0,6,6,6,6,6,6,6,0,0,0,1],
    'angle_1frame_abs': [np.nan,0,0,0,4,4,4,4,4,4,1,1,1,1,1,1,6,6,6,6,6,6,6,2,2,2,2],
})
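As a possible shortcut (not part of the steps above, so treat it as an alternative sketch), the sum and the map can be merged into a single groupby(...).transform('sum') call, which returns one value per original row. The example repeats the fake data so that it runs on its own.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Count': [2,0,0,0,4,4,4,4,4,4,0,0,0,0,0,0,6,6,6,6,6,6,6,0,0,0,1],
    'angle_1frame_abs': [np.nan,0,0,0,4,4,4,4,4,4,1,1,1,1,1,1,6,6,6,6,6,6,6,2,2,2,2],
})

# Same segment key as before
new_key = (df['Count'].diff().abs() > 0).cumsum()

# transform('sum') computes the per-segment sum and aligns it to every row
df['NewColumn'] = df.groupby(new_key)['angle_1frame_abs'].transform('sum')

print(df['NewColumn'].tolist())
```

Because sum() skips NaN by default, the first row (a one-row segment whose only value is NaN) gets 0.0 rather than NaN.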