Search code examples
rpurrr

`reduce` in purrr with `+` but ignore missing value in R


How can I use reduce function together with lag to obtain the summation but at the same time ignore missing values?

a <- data.frame( b = c(1, 2, NA, 3, 4, 5))
a %>% mutate(addition = reduce(map(0:1, ~lag(b, .x)) , `+`))

The resulting table

b addition
1 NA
2 3
NA NA
3 NA
4 7
5 9

My desire result is

b addition
1 1
2 3
NA 2
3 3
4 7
5 9

Thus, the desire result is to ignore the NA and do the addition for the current value and its lag values. I simplified the example but in the actual case, I need the rolling sum of 12.

I have tried the following, and the results are the same for the column addition in which it returns 25 for all rows. Why does this not work?

a <- data.frame( b = c(1, 2, NA, 3, 4, 5))
a %>% mutate(addition = reduce(map(0:1, ~lag(b, .x)) , ~sum(.x, .y, na.rm = TRUE)))

Solution

  • I am not sure if reduce is the right function for such calculation. Here are couple of options to achieve the same result.

    1. Replace NA values with 0 and add using +
    library(purrr)
    library(dplyr)
    
    a %>% mutate(addition = replace(b, is.na(b),0) + replace(lag(b), is.na(lag(b)),0))
    
    #   b addition
    #1  1        1
    #2  2        3
    #3 NA        2
    #4  3        3
    #5  4        7
    #6  5        9
    
    1. Use map2 and sum the two values ignoring NA values.
    a %>% mutate(addition = map2(b, lag(b), ~sum(.x, .y, na.rm = TRUE)))
    

    EDIT

    Based on your update if you need to do rolling operations then use rolling functions. For eg - with zoo::rollapplyr

    a %>% 
       mutate(addition = zoo::rollapplyr(b, 2, \(x) sum(x, na.rm = TRUE),
                          fill = first(b)))
    
    #   b addition
    #1  1        1
    #2  2        3
    #3 NA        2
    #4  3        3
    #5  4        7
    #6  5        9