Trying to find the cumsum across different types of contracts. Each has a unique stop (i.e. delivery) date with several months of expected delivery leading up to that date. Needing to calculate the cumsum of all expected deliveries before the actual delivery date.
For some reason the cumsum/rollsum function is not working. I have tried both DT and dplyr versions but both have failed.
Here is a simplified data for the problem I am working on.
df <- data.frame(report_year = c(rep(2017,10), rep(2018,10)),
report_month = c(seq(1,5,1), seq(2,6,1), seq(3,7,1), seq(2,6,1)),
delivery_year = c(rep(2017,10), rep(2018,10)),
delivery_month = c(rep(5,5),rep(6,5), rep(7,5), rep(6,5)),
sum = c(rep(seq(100,500,100), 4)),
cumsum = c(rep(c(100,300,600,1000,1500),4)))
The first 5 columns is what I currently have.
I am trying to get the last column (i.e. cumsum)
I am probably doing something wrong. Any help is appreciated.
The question did not specifically define which grouping columns to use so this may have to be modified slightly depending on what you want but this does it without any packages:
df$cumsum <- NULL # remove the result from df shown in question
transform(df, cumsum = ave(sum, delivery_year, delivery_month, FUN = cumsum))
Note that although the above works you may run into some problems using sum
and cumsum
as the column names due to confusion with the functions of the same name so you might want to use Sum and Cumsum, say. For example if you don't null out cumsum as we did above then FUN = cumsum will think that you want to apply the cumsum column which is not a function.