r dataframe dplyr case mutate

mutate_if multiple conditions along with case_when in R


I want to apply case_when() to the columns that are factors, have exactly 2 levels, and whose names do not contain "kiwi".

The code below does not work. I can get what I want with other, longer approaches, but I was wondering whether there is a more efficient way to do this.

library(dplyr)
library(stringr)

df <- data.frame(orange = factor(c(1,1,0,1,0)),
                 apple = factor(c("x", "x", "y", "y", "x")),
                 peach = 1:5,
                 kiwi = factor(c("a", "a", "b", "b", "c")))

df2 <- df %>%
    mutate_if((~ is.factor(.) & nlevels(.) == 2 & str_detect(names(.), "kiwi", negate = TRUE)),
                     ~ dplyr::case_when(.x == 0 ~ "No",
                                        .x == 1 ~ "Yes",
                                        TRUE ~ .x))


Solution

  • The dplyr documentation page "Operate on a selection of variables" says:

    Scoped verbs (_if, _at, _all) have been superseded by the use of pick() or across() in an existing verb. See vignette("colwise") for details.

    With across(), you can use the <tidy-select> syntax to keep or drop columns:

    • columns which are factors: where(is.factor)
    • have only 2 levels: where(~ nlevels(.x) == 2)
    • columns whose names do not contain "kiwi": !contains("kiwi")

    Combining the three selections:

    df %>%
      mutate(across(where(~ is.factor(.x) & nlevels(.x) == 2) & !contains("kiwi"),
                    ~ case_when(.x == 0 ~ "No",
                                .x == 1 ~ "Yes",
                                TRUE ~ .x)))
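
To check that the selection behaves as intended, here is a minimal, self-contained sketch using the data frame from the question. The helper `is_binary_factor` is a hypothetical name introduced only for this illustration and is not part of dplyr. The fallback branch uses `as.character(.x)` rather than `.x`, on the assumption that some dplyr versions may refuse to mix character and factor results inside `case_when()`.

```r
library(dplyr)

df <- data.frame(orange = factor(c(1, 1, 0, 1, 0)),
                 apple  = factor(c("x", "x", "y", "y", "x")),
                 peach  = 1:5,
                 kiwi   = factor(c("a", "a", "b", "b", "c")))

# Hypothetical helper (not from the original answer):
# TRUE only for factor columns with exactly two levels.
is_binary_factor <- function(x) is.factor(x) && nlevels(x) == 2

# Inspect which columns the tidy-select expression picks up before mutating.
selected <- df %>%
  select(where(is_binary_factor) & !contains("kiwi")) %>%
  names()
selected
# "orange" "apple"

# Recode only the selected columns; the fallback is converted to character
# (an assumption made here so every branch has the same type).
df2 <- df %>%
  mutate(across(where(is_binary_factor) & !contains("kiwi"),
                ~ case_when(.x == 0 ~ "No",
                            .x == 1 ~ "Yes",
                            TRUE ~ as.character(.x))))
```

Here `orange` is recoded to "Yes"/"No", `apple` becomes a character column with its original values, and `peach` and `kiwi` are left untouched. As a side note on the question's attempt, the predicate passed to mutate_if() receives each column's values rather than its name, and `names(df$orange)` is `NULL`, so a name-based test there has nothing to match against.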