Ignore NA in dplyr row sum


Is there an elegant way to treat NA as 0 (as with na.rm = TRUE) when summing across columns in dplyr?

library(dplyr)

data <- data.frame(a = c(1, 2, 3, 4), b = c(4, NA, 5, 6), c = c(7, 8, 9, NA))

data %>% mutate(sum = a + b + c)

a  b  c sum
1  4  7  12
2 NA  8  NA
3  5  9  17
4  6 NA  NA

But I would like to get:

a  b  c sum
1  4  7  12
2 NA  8  10
3  5  9  17
4  6 NA  10

even though I know that this is not the desired result in many other cases.


Solution

  • You could use this:

    library(dplyr)
    data %>% 
      # rowwise() makes the following operations run once per row
      rowwise() %>% 
      # sum(..., na.rm = TRUE) then drops the NA values in each row before adding
      mutate(sum = sum(a, b, c, na.rm = TRUE))
    

    Output:

    Source: local data frame [4 x 4]
    Groups: <by row>
    
          a     b     c   sum
      (dbl) (dbl) (dbl) (dbl)
    1     1     4     7    12
    2     2    NA     8    10
    3     3     5     9    17
    4     4     6    NA    10
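
    As a possible alternative, here is a minimal sketch that avoids rowwise() by passing the columns to base R's rowSums(). It assumes dplyr 1.0.0 or later, where across() is available; the name `result` is only illustrative and not part of the original answer.

```r
library(dplyr)

# Same example data as in the question
data <- data.frame(a = c(1, 2, 3, 4), b = c(4, NA, 5, 6), c = c(7, 8, 9, NA))

# across() selects the columns as a data frame; rowSums() adds them
# row by row, and na.rm = TRUE drops NA values instead of propagating them
result <- data %>%
  mutate(sum = rowSums(across(c(a, b, c)), na.rm = TRUE))

print(result)
```

    Because rowSums() is vectorised over the whole data frame, this is likely to be faster than rowwise() on large inputs, and the result is not left grouped by row. Note that with either approach a row consisting only of NA values sums to 0, not NA.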