Recode variables in multiple columns based on a condition using a function inside mutate_all


I have this data:

    # A tibble: 169 x 5
       `Topical Anesthesist`             `Skin Glue`                   `Sedation Servi~ `Child Life Ther~ Hypnosis
       <chr>                             <chr>                         <chr>            <chr>             <chr>
     1 No, because we do not see/treat ~ No, because we do not see/tr~ No               No                No
     2 No, we do not have one available  Yes                           No               No                No
     3 Yes: LET/LAT (lidocaine, epineph~ Yes                           Yes              No                No
     4 Yes: LET/LAT (lidocaine, epineph~ Yes                           Yes              No                No
     5 Yes: LET/LAT (lidocaine, epineph~ Yes                           No               No                No
     6 Yes: LET/LAT (lidocaine, epineph~ Yes                           Yes              No                Yes: during ~
     7 Yes: LET/LAT (lidocaine, epineph~ Yes                           Yes              No                Yes: during ~
     8 Yes: LET/LAT (lidocaine, epineph~ Yes                           Yes              No                Yes: during ~
     9 Yes: LET/LAT (lidocaine, epineph~ Yes                           Yes              No                Yes: during ~
    10 Yes: LET/LAT (lidocaine, epineph~ Yes                           Yes              No                No

Depending on the column, the values are "Yes" followed by various character strings ("Yes, all day", "Yes: during the whole day", etc.). All my values begin with "Yes" or "No". I want to replace every value beginning with "Yes" with the word "Checked", and every value beginning with "No" with "Unchecked" ("Checked" and "Unchecked" are values I already use in the rest of my dataset, and my code relies on them).

I tried this:

    data %>%
      mutate_all(.funs = fct_recode,
                 "Checked" = starts_with("yes"),
                 "Unchecked" = starts_with("no"))

I get this error:

    `starts_with()` must be used within a selecting function.

I can't find a simple way to solve this problem.

Thanks for your help!


Solution

  • The following code takes the starwars dataset, finds every entry that begins with "bl" and replaces it with "HA":

    library(dplyr)
    data(starwars)

    # Replace every value that starts with "bl" by "HA"; leave the rest untouched
    func <- function(x) {
      ifelse(grepl("^bl", x), "HA", x)
    }

    mutate_all(starwars, func)
    

    Following this pattern, I suggest you try something like:

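    # Values starting with "Yes" become "Checked", those starting with "No"
    # become "Unchecked"; anything else is kept as is
    func <- function(x) {
      ifelse(grepl("^Yes", x), "Checked",
             ifelse(grepl("^No", x), "Unchecked", x))
    }
    data %>% mutate_all(func)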
    

    Or more elegantly using case_when:

    func <- function(x) {
      case_when(
        grepl("^Yes", x) ~ "Checked",
        grepl("^No", x) ~ "Unchecked",
        TRUE ~ as.character(x)  # keep any other value unchanged
      )
    }

    data %>% mutate_all(func)
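
A possible follow-up, offered as a sketch rather than the definitive fix: the error in the question arises because `starts_with()` is a tidyselect helper that picks columns by name, so it cannot test the values inside a column. In dplyr 1.0.0 and later, `mutate_all()` is also superseded by `across()`. The example below shows one way the same recoding might be written with `across()`. The `survey` tibble and the function name `recode_yes_no` are made up for illustration, and the case-insensitive match is an assumption meant to cover both "yes" and "Yes"; it relies on the question's statement that every value begins with yes or no.

```r
library(dplyr)

# Hypothetical toy data standing in for the survey columns in the question
survey <- tibble(
  `Skin Glue` = c("Yes", "No, because we do not see/treat children", NA),
  Hypnosis    = c("Yes: during the whole day", "No", "No")
)

# Map values starting with "yes"/"no" (any case) to the labels used in the
# question; any other value, including NA, is returned unchanged
recode_yes_no <- function(x) {
  case_when(
    grepl("^yes", x, ignore.case = TRUE) ~ "Checked",
    grepl("^no",  x, ignore.case = TRUE) ~ "Unchecked",
    TRUE ~ as.character(x)
  )
}

# Apply the function to every column
recoded <- survey %>%
  mutate(across(everything(), recode_yes_no))

print(recoded)
```

If only some columns should be recoded, `everything()` can be swapped for a narrower selection, for example a character vector of column names wrapped in `all_of()`.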