Search code examples
rrolling-computationrowsum

Rolling rowsum over existing data frame with NAs in r


Given the data frame:

df1 <- data.frame(Company = c('A','B','C','D','E'),
                  `X1980` = c(NA, 5, 3, 8, 13),
                  `X1981` = c(NA, 12, NA, 11, 29),
                  `X1982` = c(33, NA, NA, 41, 42),
                  `X1983` = c(45, 47, 53, NA, 55))

I would like to create a new data frame where each value is replaced by the sum of the current value and the previous value of the row. NAs should be kept as they are.

This should result in the following data frame:

Company  1980  1981  1982  1983
   A      NA    NA    33    78
   B      5     17    NA    47
   C      3     NA    NA    53
   D      8     19    60    NA
   E      13    42    84    139

Solution

  • Here is a tidyverse approach

    library(dplyr)
    library(tidyr)
    library(purrr)
    
    df1 %>% 
      pivot_longer(matches("\\d{4}$")) %>% 
      group_by(Company) %>% 
      mutate(value = accumulate(value, ~if (is.na(out <- .x + .y)) .y else out)) %>% 
      pivot_wider()
    

    Output

    # A tibble: 5 x 5
    # Groups:   Company [5]
      Company X1980 X1981 X1982 X1983
      <chr>   <dbl> <dbl> <dbl> <dbl>
    1 A          NA    NA    33    78
    2 B           5    17    NA    47
    3 C           3    NA    NA    53
    4 D           8    19    60    NA
    5 E          13    42    84   139