Tags: r, melt, dummy-variable

R: factors across columns as dummy variables


I am working in R and need help with the following problem. My data is in the format below.

Users   Lang_1  Lang_2  Lang_3  Lang_4  Lang_5
user_1  C       SAS     Python  SPSS    Java
user_2  R       C++     Java
user_3  SAS     R       Python  Octave
user_4  iPython SQL     R
user_5  SQL     Java    Dot Net Python

I need the output in the format below.

Users   C  R  SAS  iPython  SQL  C++  Java  Python  DotNet  SPSS  Octave
user_1  1  0  1    0        0    0    1     1       0       1     0
user_2  0  1  0    0        0    1    1     0       0       0     0
user_3  0  1  1    0        0    0    0     1       0       0     1
user_4  0  1  0    1        1    0    0     0       0       0     0
user_5  0  0  0    0        1    0    1     1       1       0     0

I want to use this as input for a classification task. Please help me out.


Solution

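    library(reshape)

    # Read the data frame from the question (opens a file picker)
    data <- read.csv(file.choose())

    # Melt to long format; id.vars = 1 is the index of the id column (Users)
    data_m <- melt(data, id.vars = 1)

    # Remove the observations where the value column is blank or NA.
    # A logical filter is used because -which(...) drops every row
    # when no value is blank.
    data_m <- data_m[!is.na(data_m$value) & data_m$value != "", ]

    # Delete the variable column (Lang_1 ... Lang_5)
    data_m <- data_m[, -2]

    # Desired output: one row per user, one count column per language
    cast(data_m, Users ~ value, length)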
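
For anyone who wants to try this without a CSV file or the reshape package, here is a minimal base R sketch of the same idea (stack the language columns, drop blanks, cross-tabulate). It is an alternative to the answer above, not the answer's own method. The data frame is typed in by hand from the question, and the names lang_cols, long, dummies and result are only illustrative.

```r
# Sample data copied from the question; blank cells are empty strings.
data <- data.frame(
  Users  = c("user_1", "user_2", "user_3", "user_4", "user_5"),
  Lang_1 = c("C", "R", "SAS", "iPython", "SQL"),
  Lang_2 = c("SAS", "C++", "R", "SQL", "Java"),
  Lang_3 = c("Python", "Java", "Python", "R", "Dot Net"),
  Lang_4 = c("SPSS", "", "Octave", "", "Python"),
  Lang_5 = c("Java", "", "", "", ""),
  stringsAsFactors = FALSE
)

# Stack the language columns into one long vector. unlist() goes column by
# column, so the user ids are repeated once per language column to match.
lang_cols <- setdiff(names(data), "Users")
long <- data.frame(
  Users = rep(data$Users, times = length(lang_cols)),
  value = unlist(data[lang_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Drop blank or missing cells with a logical filter.
long <- long[!is.na(long$value) & long$value != "", ]

# Cross-tabulate users against languages; each cell is a count.
dummies <- table(long$Users, long$value)

# Cap counts at 1 in case a language is listed twice for the same user.
dummies[dummies > 1] <- 1

# Optional: convert to a data frame with users as row names.
result <- as.data.frame.matrix(dummies)
print(result)
```

The columns come out in sorted order rather than in the order shown in the question, and "Dot Net" keeps its space because it is taken as-is from the input cell.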