Calculate cumsum() while ignoring NA values


Consider the following named vector x.

( x <- setNames(c(1, 2, 0, NA, 4, NA, NA, 6), letters[1:8]) )
# a  b  c  d  e  f  g  h 
# 1  2  0 NA  4 NA NA  6 

I'd like to calculate the cumulative sum of x while ignoring the NA values. Many R functions have an argument na.rm which removes NA elements prior to calculations. cumsum() is not one of them, which makes this operation a bit tricky.
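To make the problem concrete, here is a small sketch of what happens with the vector above: sum() accepts na.rm, but cumsum() has no such argument, and every element from the first NA onward comes back as NA. The variable name res is only for illustration.

```r
x <- setNames(c(1, 2, 0, NA, 4, NA, NA, 6), letters[1:8])

# sum() can drop the NA values before adding
total <- sum(x, na.rm = TRUE)
total
# [1] 13

# cumsum() has no na.rm argument; the first NA poisons everything after it
res <- cumsum(x)
res
#  a  b  c  d  e  f  g  h 
#  1  3  3 NA NA NA NA NA 
```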

I can do it this way.

y <- setNames(numeric(length(x)), names(x))
z <- cumsum(na.omit(x))
y[names(y) %in% names(z)] <- z
y[!names(y) %in% names(z)] <- x[is.na(x)]
y
# a  b  c  d  e  f  g  h 
# 1  3  3 NA  7 NA NA 13 

But this seems excessive, and makes a lot of new assignments/copies. I'm sure there's a better way.

What better methods are there to return the cumulative sum while effectively ignoring NA values?


Solution

  • Do you want something like this?

    x2 <- x
    x2[!is.na(x)] <- cumsum(x2[!is.na(x)])
    
    x2
    # a  b  c  d  e  f  g  h 
    # 1  3  3 NA  7 NA NA 13 

    [edit] Alternatively, as suggested in a comment, you can temporarily change the NAs to 0s:

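    miss <- is.na(x)            # remember where the NAs are
    x0 <- replace(x, miss, 0)   # zero-filled copy, so x itself keeps its NAs
    cs <- cumsum(x0)
    cs[miss] <- NA              # put the NAs back in their original positions
    cs                          # the requested cumsum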
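If this comes up often, the zero-fill approach can be wrapped in a small helper. The sketch below is one possible way to package it, not part of the original answer; the function name cumsum_skip_na is hypothetical. Because R copies function arguments on modification, the caller's vector is left untouched.

```r
# Hypothetical helper (the name is an assumption, not from the original answer):
# cumulative sum that skips NA values but leaves NA in their positions.
cumsum_skip_na <- function(v) {
  miss <- is.na(v)
  v[miss] <- 0        # only the local copy of v is modified
  out <- cumsum(v)    # names of v are carried over to the result
  out[miss] <- NA     # restore NA where the input had NA
  out
}

x <- setNames(c(1, 2, 0, NA, 4, NA, NA, 6), letters[1:8])
res <- cumsum_skip_na(x)
res
# a  b  c  d  e  f  g  h 
# 1  3  3 NA  7 NA NA 13 
```

After the call, x still contains its original NA values, so the helper can be used without making an explicit copy first.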