Search code examples
rdataframesumrestartcounting

R - How to use cumulative sum by year and restart cumulative sum when condition is met


I have the following data frame in R:

  YEAR DOY   PRECTOT cumsum Lws   prec0
   <int> <chr>   <dbl>  <dbl> <chr> <chr>
 1  1982 121    6.05     6.05 no    no   
 2  1982 122    1.10     7.15 no    no   
 3  1982 123    0.490    7.64 no    no   
 4  1982 124    4.53    12.2  no    no   
 5  1982 125    3.94    16.1  no    no   
 6  1982 126    2.78    18.9  no    no   
 7  1982 127    0.420   19.3  no    no   
 8  1982 128    0.      19.3  no    yes  
 9  1982 129    0.0700  19.4  no    no   
10  1982 130    8.94    28.3  no    no 

I want another column that calculates the cumulative sum like in the cumsum column but then restarts counting when PRECTOT is 0, such as in row 8. Basically it should restart the cumulative sum from row 8 and the continue the cumulative sum from there, as such:

  YEAR DOY   PRECTOT cumsum Lws   prec0
   <int> <chr>   <dbl>  <dbl> <chr> <chr>
 1  1982 121    6.05     6.05 no    no   
 2  1982 122    1.10     7.15 no    no   
 3  1982 123    0.490    7.64 no    no   
 4  1982 124    4.53    12.2  no    no   
 5  1982 125    3.94    16.1  no    no   
 6  1982 126    2.78    18.9  no    no   
 7  1982 127    0.420   19.3  no    no   
 8  1982 128    0.      0  no    yes  
 9  1982 129    0.0700  0.0700  no    no   

Is there a nice and efficient way to this in R? Thank you.


Solution

  • The "restart when condition is met" part is done with a group_by(cumsum(<condition>)):

    library(dplyr)
    
    dat %>% 
      group_by(grp = cumsum(PRECTOT == 0)) %>% 
      mutate(cumsum = cumsum(PRECTOT))
    
    # # A tibble: 10 x 7
    # # Groups:   grp [2]
    #     YEAR DOY   PRECTOT cumsum Lws   prec0   grp
    #    <int> <chr>   <dbl>  <dbl> <chr> <chr> <int>
    #  1  1982 121      6.05   6.05 no    no        0
    #  2  1982 122      1.1    7.15 no    no        0
    #  3  1982 123      0.49   7.64 no    no        0
    #  4  1982 124      4.53  12.2  no    no        0
    #  5  1982 125      3.94  16.1  no    no        0
    #  6  1982 126      2.78  18.9  no    no        0
    #  7  1982 127      0.42  19.3  no    no        0
    #  8  1982 128      0      0    no    yes       1
    #  9  1982 129      0.07   0.07 no    no        1
    # 10  1982 130      8.94   9.01 no    no        1
    

    Data:

    dat <- readr::read_table2(
    "YEAR DOY   PRECTOT cumsum Lws   prec0
    1982 121    6.05     6.05 no    no
    1982 122    1.10     7.15 no    no
    1982 123    0.490    7.64 no    no
    1982 124    4.53    12.2  no    no
    1982 125    3.94    16.1  no    no
    1982 126    2.78    18.9  no    no
    1982 127    0.420   19.3  no    no
    1982 128    0.      19.3  no    yes
    1982 129    0.0700  19.4  no    no
    1982 130    8.94    28.3  no    no
    ", col_types = "icddcc")