Tags: r, function, if-statement, dplyr, across

How to apply ifelse function across multiple columns and create new columns in R


I would like to apply an ifelse function across multiple columns of my dataset and create new "rescored" columns. Here is a sample dataset:

data = data.frame(year = "2021",
                  month = sample(x = c(1:12), size = 10, replace = TRUE),
                  C1 = sample(x = c('Off', 'Yes'), size = 10, replace = TRUE),
                  C2 = sample(x = c('Off', 'Yes'), size = 10, replace = TRUE),
                  C3 = sample(x = c('Off', 'Yes'), size = 10, replace = TRUE),
                  C4 = sample(x = c('Off', 'Yes'), size = 10, replace = TRUE),
                  C5 = sample(x = c('Off', 'Yes'), size = 10, replace = TRUE),
                  C6 = sample(x = c('Off', 'Yes'), size = 10, replace = TRUE),
                  C7 = sample(x = c('Off', 'Yes'), size = 10, replace = TRUE),
                  C8 = sample(x = c('Off', 'Yes'), size = 10, replace = TRUE),
                  C9 = sample(x = c('Off', 'Yes'), size = 10, replace = TRUE),
                  C10 = sample(x = c('Off', 'Yes'), size = 10, replace = TRUE))

I would like to apply a function like this across all columns whose names begin with C:

rescored = data %>%
  mutate(T1 = ifelse(C1 == "Off", 1,
                     ifelse(C1 == "Yes", 0, NA)))

My real dataset has 50 or more columns that need this function applied. Is there a simple way to do this? I've tried variations on "across" in dplyr, like the one below, but haven't been successful. I'm sure there is also an "apply" option.

rescored = data %>%
  mutate(across(C1:C50, ifelse(~ .x == "Off", 1,
                               ifelse(~.x == "Yes", 0, NA))))

Solution

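  • Put the tilde (~) once, at the start of the lambda, not before each argument. In the attempt above, ifelse() is called immediately with a formula as its test instead of being wrapped in a function. With a single leading ~, the whole ifelse() expression becomes one function of .x, which across() applies to each selected column: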

    library(dplyr)

    # One lambda for the whole expression; .x is the current column
    data %>%
      mutate(across(starts_with('C'), ~ ifelse(.x == "Off", 1,
                                        ifelse(.x == "Yes", 0, NA))))
    
       year month C1 C2 C3 C4 C5 C6 C7 C8 C9 C10
    1  2021     1  1  0  0  1  1  0  0  1  1   1
    2  2021    12  1  1  0  0  1  1  1  0  1   0
    3  2021    10  1  0  1  0  0  1  0  0  1   1
    4  2021     3  0  1  1  1  0  1  0  0  0   1
    5  2021    11  1  0  1  1  1  0  1  0  0   0
    6  2021    12  1  0  0  1  1  1  0  0  1   0
    7  2021     4  0  0  0  1  1  0  1  0  1   0
    8  2021     2  0  0  0  1  0  0  0  0  1   0
    9  2021     3  0  0  1  0  0  1  0  0  1   0
    10 2021     9  1  0  0  0  0  0  1  0  0   0
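
As a quick check of that behaviour, here is a minimal sketch on a small hand-made data frame. The `toy` data and the extra value "Maybe" are assumptions added for illustration and are not part of the question. Any value that is neither "Off" nor "Yes" should come out as NA:

```r
library(dplyr)

# Hypothetical toy data; "Maybe" is an unexpected value added for illustration
toy <- data.frame(
  month = c(1, 2, 3),
  C1 = c("Off", "Yes", "Maybe"),
  C2 = c("Yes", "Off", "Off")
)

# Same lambda as in the answer: a single ~ in front of the whole expression
rescored_toy <- toy %>%
  mutate(across(starts_with("C"),
                ~ ifelse(.x == "Off", 1,
                         ifelse(.x == "Yes", 0, NA))))

rescored_toy$C1  # 1, 0, NA
rescored_toy$C2  # 0, 1, 1
```

The `month` column is left alone because its name does not start with "C".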
    

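    Or, to keep the original columns, write the results to new columns with .names (note that this version maps every value other than "Off" to 0 rather than NA):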

    
    data %>%
      mutate(across(starts_with('C'), ~ ifelse(.x == "Off", 1, 0),
                    .names = 'scr_{sub("C", "", .col)}'))
    #>    year month  C1  C2  C3  C4  C5  C6  C7  C8  C9 C10 scr_1 scr_2 scr_3 scr_4
    #> 1  2021     7 Yes Yes Yes Off Yes Off Off Yes Yes Yes     0     0     0     1
    #> 2  2021    11 Off Yes Yes Yes Yes Yes Off Yes Yes Yes     1     0     0     0
    #> 3  2021     1 Yes Yes Off Off Yes Yes Yes Off Yes Yes     0     0     1     1
    #> 4  2021     5 Yes Off Off Yes Yes Yes Yes Off Yes Yes     0     1     1     0
    #> 5  2021     6 Off Off Yes Yes Off Off Off Yes Off Yes     1     1     0     0
    #> 6  2021    12 Yes Yes Yes Off Off Yes Yes Yes Off Yes     0     0     0     1
    #> 7  2021     1 Off Off Off Off Yes Off Off Off Yes Yes     1     1     1     1
    #> 8  2021     1 Yes Yes Yes Off Off Yes Yes Off Off Yes     0     0     0     1
    #> 9  2021     8 Off Yes Off Yes Off Off Yes Yes Yes Yes     1     0     1     0
    #> 10 2021    10 Off Yes Off Yes Yes Off Off Yes Off Off     1     0     1     0
    #>    scr_5 scr_6 scr_7 scr_8 scr_9 scr_10
    #> 1      0     1     1     0     0      0
    #> 2      0     0     1     0     0      0
    #> 3      0     0     0     1     0      0
    #> 4      0     0     0     1     0      0
    #> 5      1     1     1     0     1      0
    #> 6      1     0     0     0     1      0
    #> 7      0     1     1     1     0      0
    #> 8      1     0     0     1     1      0
    #> 9      1     1     0     0     0      0
    #> 10     0     1     1     0     1      1
    

    Created on 2021-05-15 by the reprex package (v2.0.0)
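
The question also asks about an "apply" option. One possible base R sketch, which needs no packages, uses lapply() over the matching columns and assigns the results to new T-prefixed columns, as in the question's T1 example. The `toy` data, the helper name `rescore`, and the pattern `^C[0-9]+$` are assumptions chosen for illustration:

```r
# Hypothetical toy data; "Maybe" is an unexpected value added for illustration
toy <- data.frame(
  month = c(1, 2, 3),
  C1 = c("Off", "Yes", "Maybe"),
  C2 = c("Yes", "Off", "Off"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: "Off" -> 1, "Yes" -> 0, anything else -> NA
rescore <- function(x) ifelse(x == "Off", 1, ifelse(x == "Yes", 0, NA))

# Columns named C followed by digits, and the matching new names (C1 -> T1)
c_cols <- grep("^C[0-9]+$", names(toy), value = TRUE)
t_cols <- sub("^C", "T", c_cols)

# lapply() returns one rescored vector per column; assign them as new columns
toy[t_cols] <- lapply(toy[c_cols], rescore)

names(toy)  # "month" "C1" "C2" "T1" "T2"
```

Anchoring the pattern with `^C[0-9]+$` is a deliberate choice here: it avoids picking up other columns that merely start with "C", which a plain prefix match would include.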