
Renaming Categorical/Integer Cells into Binary Variables


Recoding variables is a big part of my analysis. I have a dataset like the following:

dat1 <- read.table(header=TRUE, text="
                   ID  Age  Align  Weat
                   8645    15-24  A  1
                   6228    15-24  B  1
                   5830    15-24  C  3
                   1844    25-34  B  3
                   4461    35-44  C  4
                   2119    55-64  C  2
                   2115    45-54  A  1
                   ")
dat1
    ID   Age Align Weat
1 8645 15-24     A    1
2 6228 15-24     B    1
3 5830 15-24     C    3
4 1844 25-34     B    3
5 4461 35-44     C    4
6 2119 55-64     C    2
7 2115 45-54     A    1

I want to convert columns 2 through 4 into binary variables. The rules are:

if in Age column, 15-24=1, otherwise=0
if in Align column, A=1, otherwise=0
if in Weat column, 3=1, otherwise=0

My current code is not an easy solution (it uses the plyr function revalue). I want simpler code that scales to larger, more complex data.

library(plyr)
dat1$Age <- revalue(dat1$Age, c("15-24"=1,"25-34"=0,"35-44"=0,"45-54"=0,"55-64"=0))
dat1$Align <- revalue(dat1$Align, c("A"=1,"B"=0,"C"=0))
dat1$Weat <- as.factor(dat1$Weat)
dat1$Weat <- revalue(dat1$Weat, c("3"=1,"1"=0,"2"=0, "4"=0))
dat1
    ID Age Align Weat
1 8645   1     1    0
2 6228   1     0    0
3 5830   1     0    1
4 1844   0     0    1
5 4461   0     0    0
6 2119   0     0    0
7 2115   0     1    0

Solution

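  • We can use a logical operation to test whether each condition is met, and then use as.integer to convert the result to 1 and 0.

    library(dplyr)  # provides %>% and mutate()

    # dat1 is the original data frame from the question, before revalue()
    dat2 <- dat1 %>%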
      mutate(Age = as.integer(Age %in% "15-24"),
             Align = as.integer(Align %in% "A"),
             Weat = as.integer(Weat == 3))
    dat2
    #     ID Age Align Weat
    # 1 8645   1     1    0
    # 2 6228   1     0    0
    # 3 5830   1     0    1
    # 4 1844   0     0    1
    # 5 4461   0     0    0
    # 6 2119   0     0    0
    # 7 2115   0     1    0
    

    Adding 0L to the logical vector also works, because the addition coerces TRUE/FALSE to integer 1/0.

    dat2 <- dat1 %>%
      mutate(Age = (Age %in% "15-24") + 0L,
             Align = (Align %in% "A") + 0L,
             Weat = (Weat == 3) + 0L)
    dat2
    #     ID Age Align Weat
    # 1 8645   1     1    0
    # 2 6228   1     0    0
    # 3 5830   1     0    1
    # 4 1844   0     0    1
    # 5 4461   0     0    0
    # 6 2119   0     0    0
    # 7 2115   0     1    0
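
Since the question asks for something that scales to more columns, the same idea can be driven by a lookup list instead of one line per column. The following is only a sketch in base R: the `targets` list is a hypothetical helper (not part of the original answer) that maps each column name to the value that should become 1, and the data is rebuilt from the question so that the snippet runs on its own.

```r
# Rebuild the example data from the question (character and integer columns).
dat1 <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
  ID    Age    Align  Weat
  8645  15-24  A      1
  6228  15-24  B      1
  5830  15-24  C      3
  1844  25-34  B      3
  4461  35-44  C      4
  2119  55-64  C      2
  2115  45-54  A      1
")

# Hypothetical lookup: column name -> value(s) to code as 1; everything else becomes 0.
targets <- list(Age = "15-24", Align = "A", Weat = 3)

dat2 <- dat1
for (col in names(targets)) {
  # %in% returns FALSE (not NA) for missing values, so NA would be coded as 0 here.
  dat2[[col]] <- as.integer(dat1[[col]] %in% targets[[col]])
}
dat2
```

Adding another binary column then only means adding an entry to `targets`. Note that `%in%` and `==` differ on missing values: `NA %in% "A"` is `FALSE`, while `NA == 3` is `NA`, so pick whichever behavior suits the data.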