Tags: r, count, na, addition, subtraction

Summarize and subtract with NA


Can anyone help?

I have a dataset:

x <- data.frame(A = c(NA, '1', '0', '0'),
            B = c('0', '0', '0', NA),
            C = c('1', NA, NA, NA))

I need to generate something like this, adding the two variables x5 and x6:

   x1 x2 x3 x4 x5 x6
A  NA  1  0  0  1  2
B   0  0  0 NA  0  3
C   1 NA NA NA  1  0

Thanks


Solution

  • I noticed the comment that you were still working on this. The answer by @ThomasIsCoding works just fine, but just in case, here's an alternative, step-by-step approach you could also consider.

    First, transpose your data frame (we'll call the result df):

    df <- as.data.frame(t(x))
    df
    
        V1   V2   V3   V4
    A <NA>    1    0    0
    B    0    0    0 <NA>
    C    1 <NA> <NA> <NA>
    

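    Now for the two additional columns, use rowSums to count how many cells in each row equal "1" and how many equal "0". The comparison returns NA wherever the data is NA, so you need na.rm = TRUE to skip those cells. The index 1:4 selects the first four columns, so the new count columns are not included in later sums.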

    df$V5 <- rowSums(df[, 1:4] == "1", na.rm = TRUE)
    df$V6 <- rowSums(df[, 1:4] == "0", na.rm = TRUE)
    df
    

    Output

        V1   V2   V3   V4 V5 V6
    A <NA>    1    0    0  1  2
    B    0    0    0 <NA>  0  3
    C    1 <NA> <NA> <NA>  1  0
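
    As a variation, the same counts can be computed on the original x with colSums, so the transpose is only needed for the final layout. This is a minimal sketch, assuming the values are stored as character strings as in the question; the intermediate names is_one, ones, zeros, and res are made up for illustration, and x5/x6 are taken from the question's desired output.

    ```r
    # Same data as in the question (values stored as character strings).
    x <- data.frame(A = c(NA, '1', '0', '0'),
                    B = c('0', '0', '0', NA),
                    C = c('1', NA, NA, NA))

    # Comparing against "1" yields a logical matrix; NA cells stay NA.
    is_one <- x == "1"

    # colSums() treats TRUE as 1; na.rm = TRUE skips the NA cells.
    ones  <- colSums(is_one, na.rm = TRUE)
    zeros <- colSums(x == "0", na.rm = TRUE)

    # Transpose for the requested layout and append the two count columns.
    # (x5 and x6 are the names used in the question; V1..V4 are R defaults.)
    res <- cbind(as.data.frame(t(x)), x5 = ones, x6 = zeros)
    res
    ```

    The counts in x5 and x6 should match V5 and V6 in the output above. Counting before transposing avoids having to index the first four columns explicitly.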