
Convert data frame with dummy variables into categorical variables


I need to convert dummy variables into categorical variables. Being new to R, I only know how to do it the other way round. Can someone point me in the right direction?

The dataframe is:

data <- data.frame(id=c(1,2,3,4,5,6,7,8,9), 
               red=c("1","0","1","0","1","0","0","0","0"),
               blue=c("1","1","1","1","0","1","1","1","0"),
               yellow=c("0","0","0","0","0","0","0","1","1"))


and the expected output is:

[image: output dataframe]


Solution

  • One option uses lapply. Ignoring the first column (id), check each remaining column for the value 1, replace those entries with the column's name, and set every other entry to NA.

    # For every column except id: the column name where the value is 1, NA otherwise
    data[-1] <- lapply(names(data[-1]), function(x) ifelse(data[[x]] == 1, x, NA))
    
    data
    #  id  red blue yellow
    #1  1  red blue   <NA>
    #2  2 <NA> blue   <NA>
    #3  3  red blue   <NA>
    #4  4 <NA> blue   <NA>
    #5  5  red <NA>   <NA>
    #6  6 <NA> blue   <NA>
    #7  7 <NA> blue   <NA>
    #8  8 <NA> blue yellow
    #9  9 <NA> <NA> yellow
    

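    Another approach avoids the lapply loop by handling all the dummy columns at once. Run it on the original data, because the code above overwrites the dummy columns:

    # data[-1] == 1 is a logical matrix; col() gives each cell's column index,
    # which is used to look up the matching column name
    data[-1] <- ifelse(data[-1] == 1, names(data[-1])[col(data[-1])], NA)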
    
    
    data
    #  id  red blue yellow
    #1  1  red blue   <NA>
    #2  2 <NA> blue   <NA>
    #3  3  red blue   <NA>
    #4  4 <NA> blue   <NA>
    #5  5  red <NA>   <NA>
    #6  6 <NA> blue   <NA>
    #7  7 <NA> blue   <NA>
    #8  8 <NA> blue yellow
    #9  9 <NA> <NA> yellow
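
Both approaches leave the label columns as character vectors. If the goal is for each column to be a categorical variable in R's sense (a factor), one possible follow-up is to pass the columns through factor(). This step is an assumption about the intended result and not part of the original answer, since the expected output is not reproduced here. The sketch rebuilds the input so that it runs on its own:

```r
# Rebuild the input from the question so the example is self-contained
data <- data.frame(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
                   red = c("1", "0", "1", "0", "1", "0", "0", "0", "0"),
                   blue = c("1", "1", "1", "1", "0", "1", "1", "1", "0"),
                   yellow = c("0", "0", "0", "0", "0", "0", "0", "1", "1"))

# Replace each 1 with the column name and everything else with NA
data[-1] <- lapply(names(data[-1]), function(x) ifelse(data[[x]] == 1, x, NA))

# Assumed extra step: turn the character label columns into factors
data[-1] <- lapply(data[-1], factor)

str(data)
```

Each resulting factor has a single level (the column name), with NA marking the rows where the dummy was 0.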