r  dataframe  dplyr  switch-statement  recode

How to recode a variable based on the value of another variable?


Let's say I have this data frame.

df <- data.frame(record = c("1", "2", "3", "4", "5", "6"),
                 fruit = c("apple", "orange", "other", "apple", "orange", "other"),
                 specify = c("", "", "red delicious", "", "", "navel"))

How do I recode the "other" values in the fruit column to "apple" or "orange" based on the values of the specify column? Record 3 (red delicious) should become apple, and record 6 (navel) should become orange.

Thank you!


Solution

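  • With dplyr version >= 1.1.0, we can use case_match. It compares each value of specify with the left-hand sides of the formulas, and .default = fruit leaves every unmatched row with its current fruit value.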

    library(dplyr)
    
    df %>% mutate(fruit = case_match(specify, 
                                     "red delicious" ~ "apple", 
                                     "navel" ~ "orange", 
                                     .default = fruit))
    
      record  fruit       specify
    1      1  apple              
    2      2 orange              
    3      3  apple red delicious
    4      4  apple              
    5      5 orange              
    6      6 orange         navel
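
  • If dplyr >= 1.1.0 is not available, the same recoding can be sketched in base R with a named lookup vector. This is a minimal sketch rather than part of the answer above: the names lookup and hit are illustrative, and stringsAsFactors = FALSE is added so the columns are character vectors on older R versions too.

```r
df <- data.frame(record = c("1", "2", "3", "4", "5", "6"),
                 fruit = c("apple", "orange", "other", "apple", "orange", "other"),
                 specify = c("", "", "red delicious", "", "", "navel"),
                 stringsAsFactors = FALSE)

# Illustrative lookup table: names are values of `specify`,
# elements are the fruit each one should map to
lookup <- c("red delicious" = "apple", "navel" = "orange")

# TRUE for rows whose `specify` value has an entry in the lookup table
hit <- df$specify %in% names(lookup)

# Overwrite fruit only on those rows; every other row keeps its current value
df$fruit[hit] <- unname(lookup[df$specify[hit]])

df
```

    Adding another mapping only requires a new entry in lookup, and rows with an empty or unlisted specify value are never touched.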