Convert a dataframe to presence absence matrix


I have a table in which each row holds a different number of elements, stored as strings:

File1 A  B  C
File2 A  B  D
File3 E  F

I want to convert it into the following format:

        A B C D E F
File1   1 1 1 0 0 0
File2   1 1 0 1 0 0
File3   0 0 0 0 1 1

I tried to do this with reshape2 but was not successful.

Sample data:

mydata <- structure(list(V1 = c("File1", "File2", "File3"), 
                         V2 = c("A", "A", "E"), V3 = c("B", "B", "F"), 
                         V4 = c("C", "D", "")), 
                   .Names = c("V1", "V2", "V3", "V4"), 
                   class = "data.frame", row.names = c(NA, -3L))

Solution

  • One possibility:

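    library(reshape2)
    # Melt to long format: one row per (file, element) pair
    df2 <- melt(mydata, id.vars = "V1")
    # Drop the empty-string padding so it does not become its own column
    df2 <- df2[df2$value != "", ]
    with(df2, table(V1, value))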
    
    #         value
    # V1      A B C D E F
    #   File1 1 1 1 0 0 0
    #   File2 1 1 0 1 0 0
    #   File3 0 0 0 0 1 1
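
  • Another possibility, sketched in base R only (no reshape2). This is an illustrative alternative, not part of the original answer. It assumes the first column holds the file labels and that empty strings are padding; the names `elements`, `files`, `keep`, and `pa` are made up for the example.

```r
# Sample data from the question
mydata <- data.frame(
  V1 = c("File1", "File2", "File3"),
  V2 = c("A", "A", "E"),
  V3 = c("B", "B", "F"),
  V4 = c("C", "D", ""),
  stringsAsFactors = FALSE
)

# Flatten the element columns (column by column) into one vector,
# and repeat the file labels once per element column so they line up
elements <- unlist(mydata[-1], use.names = FALSE)
files    <- rep(mydata$V1, times = ncol(mydata) - 1)

# Drop the empty padding cells, then cross-tabulate file against element
keep <- elements != ""
pa <- table(files[keep], elements[keep])

# Cap counts at 1 in case an element appears twice in the same row
pa[pa > 1] <- 1

print(pa)
```

    The capping step does not matter for this sample, but it keeps the result a true presence/absence matrix if a row ever repeats an element.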