I'm trying to find a cumulative sum for two separate groups, and to have each of these sums listed in separate columns.
Here is the data frame, sorted based on time:
time group value
0 A 0
0 B 0
0 A 0
1 A 0
1 B 1
1 B 0
2 B 1
2 A 1
2 A 1
2 A -1
3 A 0
3 B 1
This is what I have to find cumsum by group, and to create the cumsum column:
df$cumsum <- ave(df$value, df$group, FUN=cumsum)
time group value cumsum
0 A 0 0
0 B 0 0
0 A 0 0
1 A 0 0
1 B 1 1
1 B 0 1
2 B 1 2
2 A 1 1
2 A 1 2
2 A -1 1
3 A 0 1
3 B 1 3
How is it possible to separate the results into two columns, one for A and one for B? Alternatively, would it be possible to find a conditional cumsum? Either way, I want the results to look like this:
time group value cumsum_A cumsum_B
0 A 0 0 0
0 B 0 0 0
0 A 0 0 0
1 A 0 0 0
1 B 1 0 1
1 B 0 0 1
2 B 1 0 2
2 A 1 1 2
2 A 1 2 2
2 A -1 1 2
3 A 0 1 2
3 B 1 1 3
Thanks!
You could first find out the unique
values and using sapply
/lapply
loop over them to conditionally calculate cumsum
for each one of them.
unique_val <- unique(df$group)
df[paste0("cumsum_", unique_val)] <- lapply(unique_val,
function(x) cumsum((df$group == x) * df$value))
df
# time group value cumsum_A cumsum_B
#1 0 A 0 0 0
#2 0 B 0 0 0
#3 0 A 0 0 0
#4 1 A 0 0 0
#5 1 B 1 0 1
#6 1 B 0 0 1
#7 2 B 1 0 2
#8 2 A 1 1 2
#9 2 A 1 2 2
#10 2 A -1 1 2
#11 3 A 0 1 2
#12 3 B 1 1 3