How to use case_when with mutate_all to insert variable value


I have a seemingly small problem. I want to use mutate_all() in conjunction with case_when(). A sample data frame:

library(dplyr)      # tibble(), mutate_all(), case_when(), %>%
library(lubridate)  # today()

tbl <- tibble(
  x = c(0, 1, 2, 3, NA),
  y = c(0, 1, NA, 2, 3),
  z = c(0, NA, 1, 2, 3),
  date = rep(today(), 5)
)

I first made another data frame, replacing all the NAs with zeros and all other values with 1, using the following piece of code.

tbl %>%
 mutate_all(
    funs(
      case_when(
        . %>% is.na() ~ 0,
        TRUE ~ 1
      )))

Now I want to replace the NA values with blanks ("") and leave the other values as they are. However, I don't know how to set the TRUE branch so that it keeps the column's original value.

Any suggestions would be much appreciated!


Solution

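  • To replace the NAs with "", we can use replace_na from tidyr. Recent tidyr releases (1.2.0 and later, as far as I know) cast the replacement to the column's type, so the columns are converted to character first

    library(dplyr)
    library(tidyr)
    tbl %>%
         # as.character() comes first because "" cannot be stored in a numeric column
         mutate_all(~ replace_na(as.character(.), ""))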
    # A tibble: 5 x 3
    #  x     y     z    
    #  <chr> <chr> <chr>
    #1 0     0     0    
    #2 1     1     ""   
    #3 2     ""    1    
    #4 3     2     2    
    #5 ""    3     3    
    

    With case_when or if_else, every branch must return the same type. Here, inserting "" makes the result character, so the other values must also be coerced to character with as.character

    tbl %>%
       mutate_all(~ case_when(is.na(.) ~ "", TRUE ~ as.character(.)))
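
    The type requirement described above can be seen on a single vector. This is a minimal sketch, assuming only that dplyr is installed; the vector x and the names mixed and fixed are made up for illustration, and the exact error message differs between dplyr versions, so it is replaced by a placeholder string here.

    ```r
    library(dplyr)

    x <- c(0, 1, NA)

    # Mixing a character "" with the untouched numeric values fails,
    # because all right-hand sides must share a common type.
    mixed <- tryCatch(
      case_when(is.na(x) ~ "", TRUE ~ x),
      error = function(e) "type error"
    )

    # Converting the fallback branch to character makes the types agree.
    fixed <- case_when(is.na(x) ~ "", TRUE ~ as.character(x))
    ```

    Here mixed ends up holding the placeholder string, while fixed is a character vector with "" where x was NA.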
    

    If we want to apply this to specific columns only, we can use mutate_at

    tbl %>%
       mutate_at(vars(x:y), ~ case_when(is.na(.) ~ "", TRUE ~ as.character(.)))
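
    In dplyr 1.0.0 and later, mutate_all and mutate_at are superseded by across(). The following is a sketch of how the two calls above might look in that style; the small tbl is a stand-in for the question's data (date column omitted for brevity), and all_cols and some_cols are illustrative names.

    ```r
    library(dplyr)

    # Stand-in for the question's tbl, without the date column.
    tbl <- tibble(
      x = c(0, 1, 2, 3, NA),
      y = c(0, 1, NA, 2, 3),
      z = c(0, NA, 1, 2, 3)
    )

    # Counterpart of mutate_all(): apply the lambda to every column.
    all_cols <- tbl %>%
      mutate(across(everything(),
                    ~ case_when(is.na(.x) ~ "", TRUE ~ as.character(.x))))

    # Counterpart of mutate_at(vars(x:y), ...): only x and y are converted.
    some_cols <- tbl %>%
      mutate(across(x:y,
                    ~ case_when(is.na(.x) ~ "", TRUE ~ as.character(.x))))
    ```

    In some_cols the z column is left untouched, so it stays numeric and keeps its NA.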
    

    Also, the code in the OP's post can be simplified: the logical result of !is.na() can be coerced directly to integer with as.integer or unary +

    tbl %>% 
         mutate_all(~ as.integer(!is.na(.)))
    

    Or, if we want to keep case_when

    tbl %>%
         mutate_all(~ case_when(is.na(.) ~ 0, TRUE ~ 1))
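
    To compare the three ways of building the 0/1 indicator, here is a small sketch on a single vector. The vector x and the variable names are illustrative only. as.integer and unary + return integers, whereas case_when with the literals 0 and 1 returns doubles.

    ```r
    library(dplyr)

    x <- c(0, 1, 2, 3, NA)

    via_as_integer <- as.integer(!is.na(x))              # integer vector
    via_plus       <- +(!is.na(x))                       # unary + also gives integer
    via_case_when  <- case_when(is.na(x) ~ 0, TRUE ~ 1)  # double vector
    ```

    All three hold the same values; only the storage type differs, which matters if the result is later compared with identical().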