
Reshaping data into a specific form


I have data as follows. This is a simplified dataset; in reality I have several experiments:

DF=structure(list(theoric = c("E", "E", "F", "F", "F"), observed = c("E", 
"E", "F", "F", "E"), experiment = c("RO(2)", "RO(2)", "RO(2)", "RO(2)", 
"RO(2)")), .Names = c("theoric", "observed", "experiment"), row.names = 2:6, class = "data.frame")

My data currently has the following form:

  theoric observed  experiment
2       E        E RO(2)
3       E        E RO(2)
4       F        F RO(2)
5       F        F RO(2)
6       F        E RO(2)

And I want it reshaped as follows:

                  2 3 4 5 6
RO(2) theoric     E E F F F
RO(2) observed    E E F F E

What is the easiest way to do this? I really have no idea how to approach it. I tried

meltR <- melt(DF, id="experiment")

But I lose all correspondence between theoric and observed. Thanks a lot.

EDIT: full dataset:

DF=structure(list(theoric = c("E", "E", "F", "F", "F", "E", "F", 
"F", "F", "F", "F", "E", "E", "E", "E"), observed = c("E", "E", 
"F", "F", "E", "F", "F", "F", "F", "F", "F", "E", "E", "E", "F"
), experiment = c("RO", "RO", "RO", "RO", "RO", "MO", "MO", "MO", 
"MO", "MO", "MO", "EL", "EL", "EL", "EL")), .Names = c("theoric", 
"observed", "experiment"), row.names = c(2L, 3L, 4L, 5L, 6L, 
24L, 25L, 26L, 27L, 28L, 29L, 21L, 22L, 23L, 13L), class = "data.frame")

Output I get with the reshape solution on the full dataset:

    col2 col1.2 col1.3 col1.4 col1.5 col1.6 col1.24 col1.25 col1.26
1   RO theoric      E      E      F      F      F    <NA>    <NA>    <NA>
6   MO theoric   <NA>   <NA>   <NA>   <NA>   <NA>       E       F       F
12  EL theoric   <NA>   <NA>   <NA>   <NA>   <NA>    <NA>    <NA>    <NA>
16 RO observed      E      E      F      F      E    <NA>    <NA>    <NA>
21 MO observed   <NA>   <NA>   <NA>   <NA>   <NA>       F       F       F
27 EL observed   <NA>   <NA>   <NA>   <NA>   <NA>    <NA>    <NA>    <NA>
   col1.27 col1.28 col1.29 col1.21 col1.22 col1.23 col1.13
1     <NA>    <NA>    <NA>    <NA>    <NA>    <NA>    <NA>
6        F       F       F    <NA>    <NA>    <NA>    <NA>
12    <NA>    <NA>    <NA>       E       E       E       E
16    <NA>    <NA>    <NA>    <NA>    <NA>    <NA>    <NA>
21       F       F       F    <NA>    <NA>    <NA>    <NA>
27    <NA>    <NA>    <NA>       E       E       E       F

EDIT 2: expected output, with EL added:

RO theoric     E E F F F
RO observed    E E F F E
MO theoric     E F F F F
MO observed    F F F F F
EL theoric     E E E E
EL observed    E E E F

Solution

  • Based on the expected output, the row names need to become a column. Create a new dataset ('df2') by unlisting the first two columns, replicating the 'experiment' column (pasted together with the name of the source column), and adding a column of row names. Then use reshape from base R to convert from 'long' to 'wide' format.

    df2 <- data.frame(
        col1 = unlist(DF[1:2], use.names = FALSE),                # values: theoric, then observed
        col2 = paste(rep(DF$experiment, 2),
                     rep(colnames(DF)[1:2], each = nrow(DF))),    # row label
        col3 = rep(row.names(DF), 2))                             # original row names
    
    reshape(df2, idvar = "col2", direction="wide", timevar = "col3")
    #             col2 col1.2 col1.3 col1.4 col1.5 col1.6
    #1  RO(2) theoric      E      E      F      F      F
    #6 RO(2) observed      E      E      F      F      E
    

    Or use melt/dcast from data.table. Convert the 'data.frame' to a 'data.table' while keeping the row names (setDT(DF, keep.rownames = TRUE)), melt it to 'long' format, paste the 'experiment' and 'variable' columns together, and then dcast from 'long' to 'wide' format.

    library(data.table)
    dcast(melt(setDT(DF, keep.rownames = TRUE), id.vars = c("rn", "experiment"))[,
        experiment := paste(experiment, variable)], experiment~rn, value.var = "value")
    #       experiment 2 3 4 5 6
    #1: RO(2) observed E E F F E
    #2:  RO(2) theoric E E F F F
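
The one-liner above can also be read as three separate steps. The following is only a sketch of that reading, not a different method: the names dt, long, wide and label are made up for illustration, and as.data.table() is used instead of setDT() on the assumption that you want DF itself left unmodified.

```r
library(data.table)

# Simplified dataset from the question, with the original row names 2..6
DF <- data.frame(
  theoric    = c("E", "E", "F", "F", "F"),
  observed   = c("E", "E", "F", "F", "E"),
  experiment = "RO(2)",
  row.names  = 2:6,
  stringsAsFactors = FALSE
)

# Step 1: copy to a data.table; the row names become the column "rn"
dt <- as.data.table(DF, keep.rownames = TRUE)

# Step 2: long format, one row per (rn, experiment, variable) combination;
# keeping "rn" as an id variable is what preserves the theoric/observed pairing
long <- melt(dt, id.vars = c("rn", "experiment"))

# Step 3: build the row label (hypothetical column name "label"),
# then spread the original row names across the columns
long[, label := paste(experiment, variable)]
wide <- dcast(long, label ~ rn, value.var = "value")
print(wide)
```

Because "rn" travels with every value through the melt, each column of the wide result still holds the theoric and observed values of the same original row.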
    

    Update

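    With the new dataset the row names no longer line up across experiments, so number the values within each experiment/variable group with rowid() instead of using the row names: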

    library(data.table)  # v1.9.7+, which provides rowid()
    dcast(melt(setDT(DF), id.vars = "experiment"), paste(experiment, 
        variable)~rowid(experiment, variable), value.var="value", fill="")
    #    experiment 1 2 3 4 5 6
    #1: EL observed E E E F    
    #2:  EL theoric E E E E    
    #3: MO observed F F F F F F
    #4:  MO theoric E F F F F F
    #5: RO observed E E F F E  
    #6:  RO theoric E E F F F
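
If data.table is not available, the same idea (a position counter within each experiment instead of the row names) can be sketched in base R. This is only one possible way to do it, using ave() for the counter and reshape() for the widening; the column names pos, variable, value and id are made up for illustration, and missing positions come out as NA rather than "".

```r
# Full dataset from the question
DF <- data.frame(
  theoric    = c("E", "E", "F", "F", "F", "E", "F", "F", "F", "F", "F",
                 "E", "E", "E", "E"),
  observed   = c("E", "E", "F", "F", "E", "F", "F", "F", "F", "F", "F",
                 "E", "E", "E", "F"),
  experiment = rep(c("RO", "MO", "EL"), times = c(5, 6, 4)),
  stringsAsFactors = FALSE
)

# Position of each row within its experiment (1, 2, 3, ...)
DF$pos <- ave(seq_along(DF$experiment), DF$experiment, FUN = seq_along)

# Long format: stack theoric on top of observed, keeping experiment and position
long <- data.frame(
  experiment = rep(DF$experiment, 2),
  variable   = rep(c("theoric", "observed"), each = nrow(DF)),
  pos        = rep(DF$pos, 2),
  value      = c(DF$theoric, DF$observed),
  stringsAsFactors = FALSE
)
long$id <- paste(long$experiment, long$variable)

# Wide format: one row per experiment/variable, one column per position
wide <- reshape(long[c("id", "pos", "value")],
                idvar = "id", timevar = "pos", direction = "wide")
print(wide)
```

The columns are named value.1 to value.6; experiments with fewer than six rows are padded with NA on the right.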