
Recode observation in column depending on different column


I have a dataset called 'survey' with one row per individual ID and many question columns. I need to recode the value in one column as NA and move the observation to another column.

For example:

ID    Food    Vegetable 
aaa   NA       NA 
bbb   NA       lemon
ccc   NA       sprout
ddd   fruit    NA
eee   fruit    NA
fff   NA       watermelon

I want to move the lemon and watermelon observations (IDs bbb and fff) into the Food column, recoded as fruit, and leave NA behind in the Vegetable column. Survey respondents put them in the wrong column.

To look like:

ID    Food    Vegetable
aaa   NA      NA
bbb   fruit   NA
ccc   NA      sprout
ddd   fruit   NA
eee   fruit   NA
fff   fruit   NA

I've used:

survey<- survey %>%
    mutate(food = if_else(str_detect(Vegetable,"(lemon)|(watermelon)"),"fruit", Food))
 

This converts NA to fruit in the food column, but it doesn't convert the moved values to NA in the Vegetable column, and it also turns all the other fruits in the food column into NA!

DATA:

    structure(list(ID = c("aaa", "bbb", "ccc", "ddd", "eee", "fff"),
                   Food = c(NA, NA, NA, "fruit", "fruit", NA),
                   Vegetable = c(NA, "lemon", "sprout", NA, NA, "watermelon")),
              class = "data.frame", row.names = c(NA, -6L))

P.S.: This is a follow-up to a previous question of mine that was answered. It isn't exactly the same question, which is why I started a new one.

dplyr version (1.0.2)


Solution

  • One option is to update Food and Vegetable based on whether Vegetable values are %in% a given list, not_vegetables:

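    library(dplyr)

    # Values that respondents entered under Vegetable but that are really fruit
    not_vegetables <- c("lemon", "watermelon")
    
    # df is the survey data frame from the question
    df %>%
      mutate(Food = if_else(Vegetable %in% not_vegetables, "fruit", Food),
             Vegetable = if_else(Vegetable %in% not_vegetables, NA_character_, Vegetable))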
    

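    Another way is to call replace() across both columns and pick the replacement value with if_else(), based on cur_column():

    df %>%
      mutate(across(
        c(Food, Vegetable), 
        # In the matching rows, Food becomes "fruit" and Vegetable becomes NA
        ~replace(., 
                 Vegetable %in% not_vegetables, 
                 if_else(cur_column() == "Food", "fruit", NA_character_))
        ))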
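
As a rough end-to-end check, the sketch below rebuilds the sample data from the question, shows why the str_detect() attempt loses the existing fruit values, and applies the %in% approach. The names `misplaced` and `fixed` are illustrative only and do not come from the original post. It assumes dplyr and stringr are installed.

```r
library(dplyr)
library(stringr)

# Sample data from the question
survey <- data.frame(
  ID = c("aaa", "bbb", "ccc", "ddd", "eee", "fff"),
  Food = c(NA, NA, NA, "fruit", "fruit", NA),
  Vegetable = c(NA, "lemon", "sprout", NA, NA, "watermelon")
)

# str_detect() returns NA for NA input, and if_else() yields NA wherever
# its condition is NA. That is why rows ddd and eee lost their "fruit".
str_detect(survey$Vegetable, "lemon|watermelon")
# NA TRUE FALSE NA NA TRUE

# %in% never returns NA, so it is safe to use as an if_else() condition.
misplaced <- survey$Vegetable %in% c("lemon", "watermelon")
# FALSE TRUE FALSE FALSE FALSE TRUE

# Move the misplaced answers: set Food to "fruit" and blank out Vegetable.
fixed <- survey %>%
  mutate(
    Food = if_else(misplaced, "fruit", Food),
    Vegetable = if_else(misplaced, NA_character_, Vegetable)
  )

fixed
```

Computing the condition once, before either column is modified, keeps the two updates consistent regardless of the order in which they appear inside mutate().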