
How to use mutate_all and recode together properly using dplyr?


I have been trying to use the dplyr variant of recode together with mutate_all to recode every variable in a dataset, but it does not yield the expected output. The other answers I found do not address this problem (e.g. Recode and Mutate_all in dplyr).

Here is what I tried:

library(tidyverse)
library(car)

# Create sample data
df <- data_frame(a = c("Yes","Maybe","No","Yes"), b = c("No","Maybe","Yes","Yes"))

# Using dplyr::recode
df %>% mutate_all(funs(recode(., `1` = "Yes", `0` = "No", `NA` = "Maybe")))

No effect on values:

# A tibble: 4 × 2
      a     b
  <chr> <chr>
1   Yes    No
2 Maybe Maybe
3    No   Yes
4   Yes   Yes

What I want can be reproduced using car::Recode:

# Using car::Recode
df %>% mutate_all(funs(Recode(., "'Yes' = 1; 'No' = 0; 'Maybe' = NA")))

This is the desired outcome:

# A tibble: 4 × 2
      a     b
  <dbl> <dbl>
1     1     0
2    NA    NA
3     0     1
4     1     1

Solution

  • You inverted the key/value pairs in dplyr::recode: the old value goes on the left (as the argument name) and its replacement on the right. This works for me:

    df %>% mutate_all(funs(recode(., Yes = 1L, No = 0L, Maybe = NA_integer_)))
    
    # A tibble: 4 × 2
          a     b
      <int> <int>
    1     1     0
    2    NA    NA
    3     0     1
    4     1     1
    

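    Note that recode throws an error if you pass a plain NA, because all replacements must share one type. Use a typed NA such as NA_integer_ to match 1L and 0L.

    Also, the old values can be written quoted or unquoted: both Yes = 1L and 'Yes' = 1L work.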
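
For readers on a newer dplyr (1.0.0 or later), where funs() is deprecated and mutate_all() is superseded by across(), the same recoding could be sketched as below. This is only an illustration under two assumptions: that across() is available in your installed version, and that recode should be namespace-qualified because car also exports a function named recode, which masks dplyr's if car is attached later. The name recoded is just a placeholder for the result.

```r
library(dplyr)

# Same sample data as in the question, built with tibble() instead of
# the deprecated data_frame()
df <- tibble(a = c("Yes", "Maybe", "No", "Yes"),
             b = c("No", "Maybe", "Yes", "Yes"))

# Old value on the left (as the name), replacement on the right.
# dplyr::recode is qualified so that car::recode cannot mask it.
recoded <- df %>%
  mutate(across(everything(),
                ~ dplyr::recode(.x, Yes = 1L, No = 0L, Maybe = NA_integer_)))

print(recoded)
```

Because the replacements are integers, both columns come back as integer. If you need double columns, as in the car::Recode output, use 1, 0 and NA_real_ instead.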