Search code examples
rcumsum

How to create a cumulative sum that is grouped by 2 different columns


Say I have the following dataframe, initialized as df:

ID     date     value
A      01/2012  1
A      03/2012  2
A      05/2012  4
A      07/2012  3
A      09/2012  7
A      11/2012  1
A      01/2013  2
A      03/2013  8
A      05/2013  13
A      07/2013  2
A      09/2013  5
A      11/2013  2
B      01/2012  3
B      03/2012  9
B      05/2012  1
B      07/2012  0
B      09/2012  12
B      11/2012  3
B      01/2013  1
B      03/2013  4
B      05/2013  3
B      07/2013  3
B      09/2013  1
B      11/2013  1

Where the date variable is of the form mm/yyyy. I am looking for a way to create an additional column in this data frame that gives cumulative sum of the value column grouped by ID and year. For example, the following output is what I want:

ID     date     value  cumsum
A      01/2012  1      1
A      03/2012  2      3
A      05/2012  4      7
A      07/2012  3      10
A      09/2012  7      17
A      11/2012  1      18
A      01/2013  2      2
A      03/2013  8      10
A      05/2013  13     23
A      07/2013  2      25
A      09/2013  5      30
A      11/2013  2      32
B      01/2012  3      3
B      03/2012  9      12
B      05/2012  1      13
B      07/2012  0      13
B      09/2012  12     25
B      11/2012  3      28
B      01/2013  1      1
B      03/2013  4      5
B      05/2013  3      8
B      07/2013  3      11
B      09/2013  1      12
B      11/2013  1      13

As you can see, the sum resets for each year but also for each ID. Essentially I am unsure of how to create a cumulative sum that is grouped by 2 columns instead of just 1. Any help would be appreciated


Solution

  • For completion here is a base R option -

    transform(df, cumsum = ave(value, ID, sub('.*/', '', date), FUN = cumsum))
    
    #   ID    date value cumsum
    #1   A 01/2012     1      1
    #2   A 03/2012     2      3
    #3   A 05/2012     4      7
    #4   A 07/2012     3     10
    #5   A 09/2012     7     17
    #6   A 11/2012     1     18
    #7   A 01/2013     2      2
    #8   A 03/2013     8     10
    #9   A 05/2013    13     23
    #10  A 07/2013     2     25
    #11  A 09/2013     5     30
    #12  A 11/2013     2     32
    #13  B 01/2012     3      3
    #14  B 03/2012     9     12
    #15  B 05/2012     1     13
    #16  B 07/2012     0     13
    #17  B 09/2012    12     25
    #18  B 11/2012     3     28
    #19  B 01/2013     1      1
    #20  B 03/2013     4      5
    #21  B 05/2013     3      8
    #22  B 07/2013     3     11
    #23  B 09/2013     1     12
    #24  B 11/2013     1     13