Search code examples
rdplyrmultiple-conditionsmutate

Mutate in place and if_any with condition over multiple columns


I am using R dplyr and trying to mutate multiple columns in place. These columns are var1-var3, and they currently contain values of either 1 or NA. I want to apply the following logic: if a given row in var1-var3 has at least one value = 1, then any NA values for that row will be transformed to zero. If a row has NA across all columns (var1-var3), then the values will remain NA. I am trying this:

df <- data.frame("var1" = c(1,NA,1,NA,NA), 
              "var2" = c(NA,NA,NA,1,1),
              "var3" = c(1,NA,NA,1,NA),
              "age" = c(25,41,39,60,36) ,
              "satisfaction" = c(5,3,2,5,4)
              )

#  Output
#  var1 var2 var3 age sat
# 1    1   NA    1  25  5
# 2   NA   NA   NA  41  3
# 3    1   NA   NA  39  2
# 4   NA    1    1  60  5
# 5   NA    1   NA  36  4

df <- df %>% 
mutate_at(vars(contains('var')), ~ case_when(if_any(.x, `==`, 1), 
ifelse(is.na(.x), 0, .x ), .x) #replace NA with 0
)

Solution

  • One solution which first checks if all the values across your chosen columns are NA, and then applies case_when() in combination with replace_na() from {tidyr}:

    library(dplyr)
    df <- df |> 
      mutate(all_na = rowSums(!is.na(across(contains('var'))))) |> 
      mutate(across(contains('var'), ~case_when(
        all_na != 0 ~ tidyr::replace_na(.x, 0),
        all_na == 0 ~ .x
      ))) |> 
      select(-all_na)
    

    returns:

      var1 var2 var3 age satisfaction
    1    1    0    1  25            5
    2   NA   NA   NA  41            3
    3    1    0    0  39            2
    4    0    1    1  60            5
    5    0    1    0  36            4