Tags: r, variables, recode

Creating multiple new variables from existing ones and recoding them at the same time in R


If I want to create new variables from a range of existing ones and change their values, without handling each one individually, what is the best approach?

For example, here I create 1C from 1, keeping the 1s and recoding the 2s and 0s to 0. How would I create 100 new variables at once (named 1C, 2C, 3C, etc.) using the same logic?

df$`1C`[df$`1`==1]<-1
df$`1C`[df$`1`==2]<-0
df$`1C`[df$`1`==0]<-0

Solution

  • We can use dplyr::across and dplyr::recode:

    Imagine we had the following data:

    set.seed(123)
    df <- setNames(data.frame(1:5,matrix(sample(0:2,25,replace = TRUE),nrow = 5)),c("ID",1:5))
    df
      ID 1 2 3 4 5
    1  1 2 1 1 0 0
    2  2 2 1 1 2 0
    3  3 2 1 0 2 2
    4  4 1 2 1 0 1
    5  5 2 0 2 0 2
    

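    We can use the tidyselect : operator to select the range of columns. dplyr::recode takes the recoding rules through its ... argument as <old> = <new> pairs. The .names argument of across controls how the new columns are named; here "{.col}C" appends C to each original column name. Note that the pipeline below only prints the result; assign it back to df to keep the new columns.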

    library(dplyr)
    df %>%
       mutate(across(`1`:`5`, ~recode(.,`0` = 0, `1` = 1, `2` = 0),
                     .names = "{.col}C"))
      ID 1 2 3 4 5 1C 2C 3C 4C 5C
    1  1 2 1 1 0 0  0  1  1  0  0
    2  2 2 1 1 2 0  0  1  1  0  0
    3  3 2 1 0 2 2  0  1  0  0  0
    4  4 1 2 1 0 1  1  0  1  0  1
    5  5 2 0 2 0 2  0  0  0  0  0
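
For comparison, here is a minimal base R sketch of the same recoding that does not depend on dplyr. It assumes every column except ID should be recoded; recode_binary, src_cols and new_cols are hypothetical names introduced only for this illustration. The same lines should work unchanged for 100 columns, since the column names are taken from the data frame rather than typed out.

```r
# Rebuild the example data from the answer (columns "1" to "5" hold 0/1/2 codes).
set.seed(123)
df <- setNames(
  data.frame(1:5, matrix(sample(0:2, 25, replace = TRUE), nrow = 5)),
  c("ID", 1:5)
)

# Hypothetical helper: keep 1 as 1, map 0 and 2 to 0, anything else becomes NA.
recode_binary <- function(x) {
  lookup <- c("0" = 0, "1" = 1, "2" = 0)
  unname(lookup[as.character(x)])
}

src_cols <- setdiff(names(df), "ID")   # assumption: recode every column except ID
new_cols <- paste0(src_cols, "C")      # "1C", "2C", ...

# Apply the helper to each source column and store the results as new columns.
df[new_cols] <- lapply(df[src_cols], recode_binary)
df
```

The lookup vector plays the role of the <old> = <new> pairs passed to recode, so changing the mapping only means editing that one line.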