r dataframe sum na

Ignore NA in vector sum


Is there a way to handle NA values when summing several columns of a data frame?

This is a simulated example of the data I am working with:

id <- rep(1:4, each = 8)
v1 <- c(1, 2, 5, 4, 58, 6, 4, 9)
v2 <- c(78, 85, 56, 47, 12, 3, 65, 98)
v3 <- c(101, NA, 452, NA, NA, 45, 7, 56)
data <- data.frame(id, v1, v2, v3)
data
  id v1 v2  v3
1  1  1 78 101
2  1  2 85  NA
3  2  5 56 452
4  2  4 47  NA
5  3 58 12  NA
6  3  6  3  45
7  4  4 65   7
8  4  9 98  56

I want to apply this formula using v1, v2, and v3:

data$cat <- v1 * 0.05 + v2 * 0.05 + v3 * 0.05

This is the result I get; every row where v3 is NA ends up with NA in cat:

data
  id v1 v2  v3   cat
1  1  1 78 101  9.00
2  1  2 85  NA    NA
3  2  5 56 452 25.65
4  2  4 47  NA    NA
5  3 58 12  NA    NA
6  3  6  3  45  2.70
7  4  4 65   7  3.80
8  4  9 98  56  8.15

v1, v2, and v3 are numeric vectors.


Solution

  • You can use rowSums with na.rm = TRUE (as @akrun suggested in the comments), like this:

    data$cat <- rowSums(data[-1] * c(0.05, 0.05, 0.05)[col(data[-1])], na.rm = TRUE)
    

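    which gives the following (32 rows, because rep(1:4, each = 8) in the question yields 32 ids and data.frame recycles the eight-value vectors to match):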

    > data
       id v1 v2  v3   cat
    1   1  1 78 101  9.00
    2   1  2 85  NA  4.35
    3   1  5 56 452 25.65
    4   1  4 47  NA  2.55
    5   1 58 12  NA  3.50
    6   1  6  3  45  2.70
    7   1  4 65   7  3.80
    8   1  9 98  56  8.15
    9   2  1 78 101  9.00
    10  2  2 85  NA  4.35
    11  2  5 56 452 25.65
    12  2  4 47  NA  2.55
    13  2 58 12  NA  3.50
    14  2  6  3  45  2.70
    15  2  4 65   7  3.80
    16  2  9 98  56  8.15
    17  3  1 78 101  9.00
    18  3  2 85  NA  4.35
    19  3  5 56 452 25.65
    20  3  4 47  NA  2.55
    21  3 58 12  NA  3.50
    22  3  6  3  45  2.70
    23  3  4 65   7  3.80
    24  3  9 98  56  8.15
    25  4  1 78 101  9.00
    26  4  2 85  NA  4.35
    27  4  5 56 452 25.65
    28  4  4 47  NA  2.55
    29  4 58 12  NA  3.50
    30  4  6  3  45  2.70
    31  4  4 65   7  3.80
    32  4  9 98  56  8.15
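
For readers who find the col() indexing hard to follow, here is a minimal sketch of the same idea using sweep(), which multiplies each column by its own weight before rowSums() skips the NAs. It is only one possible way to write it, not the answer's method. The names cols, w, weighted and all_na are made up for illustration, and id is built with each = 2 on the assumption that the eight-row table printed in the question is the intended data.

```r
# Rebuild the question's data. each = 2 is an assumption chosen to match
# the eight rows printed in the question.
v1 <- c(1, 2, 5, 4, 58, 6, 4, 9)
v2 <- c(78, 85, 56, 47, 12, 3, 65, 98)
v3 <- c(101, NA, 452, NA, NA, 45, 7, 56)
data <- data.frame(id = rep(1:4, each = 2), v1, v2, v3)

cols <- c("v1", "v2", "v3")               # columns to combine (illustrative name)
w <- c(v1 = 0.05, v2 = 0.05, v3 = 0.05)   # one weight per column (illustrative name)

# sweep() over margin 2 multiplies every column by its matching weight.
weighted <- sweep(as.matrix(data[cols]), 2, w[cols], `*`)

# na.rm = TRUE drops the NA cells, so a missing v3 counts as 0 in the sum.
data$cat <- rowSums(weighted, na.rm = TRUE)

# With na.rm = TRUE a row whose values are all NA sums to 0.
# Setting such rows back to NA is optional; none occur in this data.
all_na <- rowSums(!is.na(data[cols])) == 0
data$cat[all_na] <- NA

print(data)
```

Since all three weights are equal in this example, 0.05 * rowSums(data[cols], na.rm = TRUE) should give the same numbers; the sweep() form is mainly useful when the weights differ per column.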