Search code examples
rdplyrreshape2melt

Structure data for 2x2x2 ANOVA in R


I have the following dataset, of which a subset is:

structure(list(Sex = c("Male", "Male", "Female", "Male", "Male", "Male"
), AgeGroup = structure(c(2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Young", 
"Old"), class = "factor"), FPT_Values_LEFT = c(0.615962446, 
NA, 0.466983199, 0.607728601, 0.534221292, 0.577650726), FPT_Values_RIGHT = c(0.534368277, 
NA, 0.456424206, 0.589604795, 0.564736903, 0.484459132)), row.names = c(NA, 
6L), class = "data.frame")

So:

     Sex AgeGroup FPT_Values_LEFT FPT_Values_RIGHT
1   Male      Old       0.6159624        0.5343683
2   Male      Old              NA               NA
3 Female      Old       0.4669832        0.4564242
4   Male      Old       0.6077286        0.5896048
5   Male      Old       0.5342213        0.5647369
6   Male      Old       0.5776507        0.4844591

I would like to run an ANOVA of AgeGroup*SIDE*value but need to get the dataset to include a side column that corresponds to either FPT_Values_LEFT or FPT_Value_RIGHT.

I have used varitions of melt but its not accurately lining up by side with the values.

DF$SIDE <- c("Left", "Right") 

reshape::melt(id="SIDE")
or
reshape::melt(id=c("SIDE", "AgeGroup"))

reshape::melt(id=c("SIDE", "AgeGroup"), measure.vars = c("FPT_Value_LEFT", "FPT_Value_RIGHT"))

I know I can transpose and append the data to move the right values under left, and copy the sex and age columns to match, then add a side column. I am not sure if there is a more efficient method though.

Expected output:

   Sex AgeGroup FPT_Values         SIDE
1   Male      Old       0.6159624  Left
2   Male      Old              NA  Left
3 Female      Old       0.4669832  Left
4   Male      Old       0.6077286  Left
5   Male      Old       0.5342213  Left
6   Male      Old       0.5776507  Left
1   Male      Old     0.5343683    Right
2   Male      Old            NA    Right
3 Female      Old     0.4564242    Right
4   Male      Old     0.5896048    Right
5   Male      Old     0.5647369    Right
6   Male      Old     0.4844591    Right



Solution

  • Using data.table you can do:

    > library(data.table)
    > melt(as.data.table(DF), id = c("Sex", "AgeGroup"))[, variable := sub("FPT_Values_", "", variable)][]
           Sex AgeGroup variable     value
     1:   Male      Old     LEFT 0.6159624
     2:   Male      Old     LEFT        NA
     3: Female      Old     LEFT 0.4669832
     4:   Male      Old     LEFT 0.6077286
     5:   Male      Old     LEFT 0.5342213
     6:   Male      Old     LEFT 0.5776507
     7:   Male      Old    RIGHT 0.5343683
     8:   Male      Old    RIGHT        NA
     9: Female      Old    RIGHT 0.4564242
    10:   Male      Old    RIGHT 0.5896048
    11:   Male      Old    RIGHT 0.5647369
    12:   Male      Old    RIGHT 0.4844591
    

    If you want to continue using melt from "reshape2" rather than using data.table, you can achieve the same by doing:

    transform(reshape2::melt(DF, c("Sex", "AgeGroup")), variable = sub("FPT_Values_", "", variable))
    

    However, I would recommend updating to data.table or converting your code to using tidyr.