Debugging a custom function that uses case_when in R


From the original dataframe:

  id  tru  wex  kjd
1 101    2    2    2
2 106    2    2 <NA>
3 107    0    0    0
4 110 <NA>    2    0
5 115    2    2    2
6 118    0 <NA>    0

I would like to get this data frame:

  id  tru  wex  kjd tru_new wex_new kjd_new
1 101    2    2    2    GOOD    GOOD    GOOD
2 106    2    2 <NA>    GOOD    GOOD    <NA>
3 107    0    0    0     BAD     BAD     BAD
4 110 <NA>    2    0    <NA>    GOOD     BAD
5 115    2    2    2    GOOD    GOOD    GOOD
6 118    0 <NA>    0     BAD    <NA>     BAD

Here's my code that works, but is repetitive:

id <- c("101", "106", "107", "110", "115", "118")
tru <- c("2", "2", "0", NA, "2", "0")
wex <- c("2", "2", "0", "2", "2", NA)
kjd <- c("2", NA, "0", "0", "2", "0")
dfname <- data.frame(id, tru, wex, kjd)

library(dplyr)
dfname <- dfname %>%  
      mutate(tru_new = case_when(!is.na(tru) & tru=="2" ~ "GOOD",
      !is.na(tru) & tru=="0" ~ "BAD",
      is.na(tru) ~ NA_character_))

dfname <- dfname %>%  
      mutate(wex_new = case_when(!is.na(wex) & wex=="2" ~ "GOOD",
      !is.na(wex) & wex=="0" ~ "BAD",
      is.na(wex) ~ NA_character_))

dfname <- dfname %>%  
      mutate(kjd_new = case_when(!is.na(kjd) & kjd=="2" ~ "GOOD",
      !is.na(kjd) & kjd=="0" ~ "BAD",
      is.na(kjd) ~ NA_character_))

I want to get my own function working so that the code is less repetitive. I'd appreciate answers in this style, since I'm learning to write functions and use case_when.

My function attempt is here. Can you help me get this new function working? There is probably a minor problem that I just can't seem to fix.

library(dplyr)
new_cats <- function(var_old, var_new, df) {
  df <- df %>%  
      mutate(var_new = case_when(!is.na(var_old) & var_old=="2" ~ "GOOD",
      !is.na(var_old) & var_old=="0" ~ "BAD",
      is.na(var_old) ~ NA_character_))
  return(df)
}

dfname <- new_cats(tru, tru_new, dfname)
dfname <- new_cats(wex, wex_new, dfname)
dfname <- new_cats(kjd, kjd_new, dfname)

Wrong result from my created function:

  id  tru  wex  kjd var_new
1 101    2    2    2    GOOD
2 106    2    2 <NA>    <NA>
3 107    0    0    0     BAD
4 110 <NA>    2    0     BAD
5 115    2    2    2    GOOD
6 118    0 <NA>    0     BAD

Solution

  • You can do it with across and ifelse:

    library(dplyr)
    dfname %>% 
      mutate(across(-id, ~ ifelse(.x == "2", "GOOD", "BAD"), 
                    .names = "{col}_new"))
    
    #    id  tru  wex  kjd tru_new wex_new kjd_new
    # 1 101    2    2    2    GOOD    GOOD    GOOD
    # 2 106    2    2 <NA>    GOOD    GOOD    <NA>
    # 3 107    0    0    0     BAD     BAD     BAD
    # 4 110 <NA>    2    0    <NA>    GOOD     BAD
    # 5 115    2    2    2    GOOD    GOOD    GOOD
    # 6 118    0 <NA>    0     BAD    <NA>     BAD
    

    Or with case_when and across:

    dfname %>% 
      mutate(across(-id, 
                    ~ case_when(.x == "2" ~ "GOOD",
                                .x != "2" ~ "BAD"), 
                    .names = "{col}_new"))
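
    Note that .x != "2" labels every non-missing value other than "2" as BAD. If the columns might hold other codes, a stricter variant can match "0" explicitly and wrap the across call in a reusable function. The following is only a sketch: recode_cols is a hypothetical helper name (not part of dplyr), and the data frame is rebuilt from the question so the example runs on its own.

    ```r
    library(dplyr)

    # Same example data as in the question
    dfname <- data.frame(
      id  = c("101", "106", "107", "110", "115", "118"),
      tru = c("2", "2", "0", NA, "2", "0"),
      wex = c("2", "2", "0", "2", "2", NA),
      kjd = c("2", NA, "0", "0", "2", "0")
    )

    # recode_cols is a hypothetical helper, not a dplyr function.
    # `cols` is a tidyselect expression that {{ }} forwards to across().
    recode_cols <- function(df, cols) {
      df %>%
        mutate(across({{ cols }},
                      # values matching neither condition (including NA) become NA
                      ~ case_when(.x == "2" ~ "GOOD",
                                  .x == "0" ~ "BAD"),
                      .names = "{.col}_new"))
    }

    result <- recode_cols(dfname, -id)
    ```

    Calling recode_cols(dfname, c(tru, wex)) instead would recode only those two columns.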
    

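    Your function does not work because mutate() takes var_new literally as the column name, so every call overwrites a single column called var_new instead of creating tru_new, wex_new and kjd_new. That is why your result shows only the kjd recoding. To fix it, wrap var_new in {{ }} and assign with := instead of =. var_old needs {{ }} as well: without it, the function reads the vectors tru, wex and kjd from your global environment rather than the columns of df. I also simplified your code: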

    new_cats <- function(var_old, var_new, df) {
      df %>%  
        mutate({{var_new}} := case_when({{var_old}} == "2" ~ "GOOD",
                                        {{var_old}} != "2" ~ "BAD"))
    }
    
    dfname <- new_cats(tru, tru_new, dfname)
    dfname <- new_cats(wex, wex_new, dfname)
    dfname <- new_cats(kjd, kjd_new, dfname)
    
    #    id  tru  wex  kjd tru_new wex_new kjd_new
    # 1 101    2    2    2    GOOD    GOOD    GOOD
    # 2 106    2    2 <NA>    GOOD    GOOD    <NA>
    # 3 107    0    0    0     BAD     BAD     BAD
    # 4 110 <NA>    2    0    <NA>    GOOD     BAD
    # 5 115    2    2    2    GOOD    GOOD    GOOD
    # 6 118    0 <NA>    0     BAD    <NA>     BAD
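
    To see the naming problem in isolation, the sketch below compares a plain argument name on the left of = with an embraced name on the left of :=. The function names literal_name and embraced_name and the small data frame are made up for illustration only.

    ```r
    library(dplyr)

    # Small illustrative data frame (hypothetical, not the full example data)
    small <- data.frame(id = c("101", "106"), tru = c("2", "0"))

    # The left-hand side of `=` is used literally: the new column is called
    # "var_new" no matter what the caller passes in.
    literal_name <- function(var_new, df) {
      df %>% mutate(var_new = "x")
    }
    names(literal_name(tru_new, small))
    # column names: id, tru, var_new

    # {{ }} together with := uses the name supplied by the caller.
    embraced_name <- function(var_new, df) {
      df %>% mutate({{ var_new }} := "x")
    }
    names(embraced_name(tru_new, small))
    # column names: id, tru, tru_new
    ```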