
R Dummy-variable to be populated from multiple columns


I am a beginner in R and am looking to create dummy variables on a dataset.

I have a data set with a few columns, like the one below:

Dataset1
T1  T2  T3
A   C   B
A   C   B
A   C   B
A   D   C
B   D   C
B   E   F

I want to add dummy variables to this, named dummy,A; dummy,B; dummy,C and so on, and set each one to 1 if its value is present in any of T1, T2 or T3, else 0.

So the final data set should look like this:

T1  T2  T3  dummy,A  dummy,B  dummy,C  dummy,D  dummy,E  dummy,F
A   C   B   1        1        1        0        0        0
A   C   B   1        1        1        0        0        0
A   C   B   1        1        1        0        0        0
A   D   C   1        0        1        1        0        0
B   D   C   0        1        1        1        0        0
B   E   F   0        1        0        0        1        1

So can anyone please suggest how I can achieve this?

Any help in this regard is really appreciated. Thanks!


Solution

  • We can use mtabulate from qdapTools. Transpose 'Dataset1' so that each original row becomes a column, convert the result to a data.frame, apply mtabulate to it, change the column names (if needed), and cbind the result with the original 'Dataset1'.

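    library(qdapTools)
    # transpose so each original row becomes a column, then tabulate each column
    d1 <- mtabulate(as.data.frame(t(Dataset1)))
    # drop the row names left over from the transposed data
    row.names(d1) <- NULL
    # prefix the value names: A -> dummy.A, B -> dummy.B, ...
    names(d1) <- paste0("dummy.", names(d1))
    # attach the indicator columns to the original data
    cbind(Dataset1, d1)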
    #   T1 T2 T3 dummy.A dummy.B dummy.C dummy.D dummy.E dummy.F
    #1  A  C  B       1       1       1       0       0       0
    #2  A  C  B       1       1       1       0       0       0
    #3  A  C  B       1       1       1       0       0       0
    #4  A  D  C       1       0       1       1       0       0
    #5  B  D  C       0       1       1       1       0       0
    #6  B  E  F       0       1       0       0       1       1
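
  • If installing qdapTools is not an option, the same result can probably be reached in base R. The following is only a minimal sketch, not part of the original answer: it rebuilds 'Dataset1' from the question (assuming plain character columns), and the names lvls, m, dummies and result are made up for illustration. For each distinct value, it flags the rows where any of T1, T2 or T3 equals that value.

    # example data from the question, assumed to be character columns
    Dataset1 <- data.frame(
      T1 = c("A", "A", "A", "A", "B", "B"),
      T2 = c("C", "C", "C", "D", "D", "E"),
      T3 = c("B", "B", "B", "C", "C", "F"),
      stringsAsFactors = FALSE
    )

    # every distinct value found in any column, in sorted order
    lvls <- sort(unique(unlist(Dataset1, use.names = FALSE)))

    # character matrix so that `m == v` compares all cells at once
    m <- as.matrix(Dataset1)

    # one integer column per value: 1 if the value occurs anywhere in the row, else 0
    dummies <- sapply(lvls, function(v) as.integer(rowSums(m == v) > 0))
    colnames(dummies) <- paste0("dummy.", lvls)

    # attach the indicator columns to the original data
    result <- cbind(Dataset1, dummies)
    result

    Because the test is "occurs at least once in the row", a value that is repeated within a row still yields 1 rather than a count. For the example data this should give the same table as the output shown above.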