
Is it possible to skip NA values with the "+" operator?


I want to evaluate an equation elementwise in R, skipping NA values. I don't want to use the function sum because it returns a single value; I want the full vector of values.

x = 1:10
y = c(21:29,NA)
x+y
 [1] 22 24 26 28 30 32 34 36 38 NA

x = 1:10
y = c(21:30)
x+y
 [1] 22 24 26 28 30 32 34 36 38 40

I don't want:

sum(x,y, na.rm = TRUE)
[1] 280

This returns a single number, not a vector.

This is a toy example; my real equation is more complex and uses multiple vectors of 84647 elements each.

Here is another example of what I mean:

x = 1:10
y = c(21:29,NA)
z = 11:20
a = c(NA,NA,NA,30:36)
5 +2*(x+y-50)/(x+y+z+a) 
 [1]       NA       NA       NA 4.388889 4.473684 4.550000 4.619048 4.681818 4.739130       NA

Solution

    1) %+% Define a custom + operator:

    `%+%` <- function(x, y)  mapply(sum, x, y, MoreArgs = list(na.rm = TRUE))
    5 + 2 * (x %+% y - 50) / (x %+% y %+% z %+% a)
    

    giving:

    [1] 3.303030 3.555556 3.769231 4.388889 4.473684 4.550000 4.619048 4.681818
    [9] 4.739130 3.787879
    

    Here are some simple examples:

    1 %+% 2
    ## [1] 3
    
    NA %+% 2
    ## [1] 2
    
    2 %+% NA
    ## [1] 2
    
    NA %+% NA
    ## [1] 0
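
Because mapply calls sum once per element, (1) may be slow on long vectors such as the 84647-element ones mentioned in the question. Below is a minimal vectorized sketch, assuming NA should simply count as 0 in every sum. The operator name %p% is hypothetical and not part of the original answer.

```r
# Hypothetical vectorized counterpart of %+% (the name %p% is illustrative only).
`%p%` <- function(x, y) {
  x[is.na(x)] <- 0  # treat missing values in the left operand as 0
  y[is.na(y)] <- 0  # treat missing values in the right operand as 0
  x + y             # ordinary vectorized addition
}

# Inputs from the question
x <- 1:10
y <- c(21:29, NA)
z <- 11:20
a <- c(NA, NA, NA, 30:36)

result <- 5 + 2 * (x %p% y - 50) / (x %p% y %p% z %p% a)
result
```

Like %+%, this yields 0 for NA %p% NA, so it should be a drop-in replacement wherever that convention is acceptable.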
    

    2) na2zero Another possibility is to define a function which maps NA to 0 like this:

    na2zero <- function(x) ifelse(is.na(x), 0, x)
    
    X <- na2zero(x)
    Y <- na2zero(y)
    Z <- na2zero(z)
    A <- na2zero(a)
    
    5 + 2 * (X + Y - 50) / (X + Y + Z + A)
    

    giving:

    [1] 3.303030 3.555556 3.769231 4.388889 4.473684 4.550000 4.619048 4.681818
    [9] 4.739130 3.787879
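
If creating a separate copy of each vector by hand is inconvenient, the same idea can be sketched with lapply and with, so the equation keeps its original variable names. This is an illustrative variation on (2), not part of the original answer; the name vars is hypothetical.

```r
# na2zero as defined in (2)
na2zero <- function(x) ifelse(is.na(x), 0, x)

# Inputs from the question
x <- 1:10
y <- c(21:29, NA)
z <- 11:20
a <- c(NA, NA, NA, 30:36)

# Apply na2zero to every vector at once; `vars` is a hypothetical name
vars <- lapply(list(x = x, y = y, z = z, a = a), na2zero)

# Evaluate the equation using the cleaned copies stored in the list
result <- with(vars, 5 + 2 * (x + y - 50) / (x + y + z + a))
result
```

The original x, y, z and a are left untouched; only the copies inside vars have NA replaced by 0.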
    

    3) combine above A variation combining (1) with the idea in (2) is:

    X <- x %+% 0
    Y <- y %+% 0
    Z <- z %+% 0
    A <- a %+% 0
    
    5 + 2 * (X + Y - 50) / (X + Y + Z + A)
    

    4) numeric0 class We can define a custom class "numeric0" with its own + operator:

    as.numeric0 <- function(x) structure(x, class = "numeric0")
    `+.numeric0` <- `%+%`
    
    X <- as.numeric0(x)
    Y <- as.numeric0(y)
    Z <- as.numeric0(z)
    A <- as.numeric0(a)
    
    5 + 2 * (X + Y - 50) / (X + Y + Z + A)
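
One detail of (4) is worth checking: the value returned by the method is built by mapply, so it is expected to come back as a plain vector without the "numeric0" class. The chained sum still works because each + in X + Y + Z + A has a numeric0 operand on its right. The sketch below, which only reuses the definitions from (1) and (4), illustrates this and compares the two approaches.

```r
# Definitions from (1) and (4)
`%+%` <- function(x, y) mapply(sum, x, y, MoreArgs = list(na.rm = TRUE))
as.numeric0 <- function(x) structure(x, class = "numeric0")
`+.numeric0` <- `%+%`

# Inputs from the question
x <- 1:10
y <- c(21:29, NA)

X <- as.numeric0(x)
Y <- as.numeric0(y)

s <- X + Y                 # dispatches to `+.numeric0`
inherits(s, "numeric0")    # does the result still carry the class?

# The class-based sum and the explicit operator should agree
same <- isTRUE(all.equal(as.numeric(s), as.numeric(x %+% y)))
same
```

A consequence, under this reading, is that an expression such as (x + y) + Y would add the plain vectors first and let NA propagate, so every operand that may contain NA should be wrapped with as.numeric0.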
    

    Note: The inputs used were those in the question, namely:

    x = 1:10
    y = c(21:29,NA)
    z = 11:20
    a = c(NA,NA,NA,30:36)
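
With these inputs, a further option that needs no helper function is to bind the vectors into columns and let rowSums skip the NA values. This is a sketch under the same assumption as above (NA counts as 0 in each sum) and is not taken from the original answer; num and den are hypothetical names.

```r
# Inputs from the question
x <- 1:10
y <- c(21:29, NA)
z <- 11:20
a <- c(NA, NA, NA, 30:36)

# Each rowSums call adds the vectors elementwise, ignoring NA
num <- rowSums(cbind(x, y), na.rm = TRUE) - 50
den <- rowSums(cbind(x, y, z, a), na.rm = TRUE)

result <- 5 + 2 * num / den
result
```

rowSums is vectorized, so this form may scale better than mapply when the vectors are long.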