I have the following dataset:
Time = ['00:01', '00:02','00:03','00:01','00:02','00:03','00:01','00:02','00:03']
ID = [1, 1, 1, 2, 2, 2, 3, 3, 3]
Value = [3.5, 3.5, 3.5, 4.1, 4.1, 4.1, 2.3, 2.3, 2.3]
df = pd.DataFrame({'Time':Time, 'ID':ID, 'Value':Value})
Each value is the same for each ID. I want to create a new column that adds up Value column accumulatively, but only when each ID changes. V
So instead of getting
3.5 7 10.5 14.6 18.7 22.8 25.1 27.3 29.5
I want
3.5 3.5 3.5 7.6 7.6 7.6 9.9 9.9 9.9
using .loc
to assign your value,
shift
to test where ID changes
and then
cumsum
with ffill
df.loc[:, "Val"] = df[df["ID"].ne(df["ID"].shift())][
"Value"
].cumsum()
df['Val'] = df['Val'].ffill()
print(df)
Time ID Value Val
0 00:01 1 3.5 3.5
1 00:02 1 3.5 3.5
2 00:03 1 3.5 3.5
3 00:01 2 4.1 7.6
4 00:02 2 4.1 7.6
5 00:03 2 4.1 7.6
6 00:01 3 2.3 9.9
7 00:02 3 2.3 9.9
8 00:03 3 2.3 9.9
or more simply as suggested by Ch3steR
df['Value'].where(df['Value'].ne(df['Value'].shift(1))).cumsum().ffill()
0 3.5
1 3.5
2 3.5
3 7.6
4 7.6
5 7.6
6 9.9
7 9.9
8 9.9