Search code examples
rdplyrtidyversecoalesce

Coalesce two columns with a condition in dplyr


I would like to coalesce two columns to remove all NA but at the same time where both columns have values I would like to keep only the highest value.

Example:

df <- data.frame(A = c(1,0,1,0,1,1,0,0,NA),
                 B = c(0,NA,1,1,NA,1,0,1,1))

   A  B
1  1  0
2  0 NA
3  1  1
4  0  1
5  1 NA
6  1  1
7  0  0
8  0  1
9 NA  1

wanted result

   A  B C
1  1  0 1
2  0 NA 0
3  1  1 1
4  0  1 1
5  1 NA 1
6  1  1 1
7  0  0 0
8  0  1 1
9 NA  1 1

Solution

  • You can calculate rowwise maximum removing NA values:

    matrixStats::rowMaxs(as.matrix(df), na.rm = TRUE)
    #[1] 1 0 1 1 1 1 0 1 1
    

    Or with dplyr :

    library(dplyr)
    
    df %>%
      rowwise() %>%
      mutate(C = max(c_across(), na.rm = TRUE))
    
    #     A     B     C
    #  <dbl> <dbl> <dbl>
    #1     1     0     1
    #2     0    NA     0
    #3     1     1     1
    #4     0     1     1
    #5     1    NA     1
    #6     1     1     1
    #7     0     0     0
    #8     0     1     1
    #9    NA     1     1