Search code examples
rarraysdataframefunctiondplyr

Summarizing an array into a dataframe in R


I wonder how to turn my TABLE below to my Desired_output?

In my Desired_output, n_study is the number of rows in TABLE for which the value corresponding to a column other than study is NOT 0.

For example, for the first element in TABLE (, , treat_grp = conventional), n_study for baseline is 1 because there is only 1 row for baseline with a non 0 element, and so on.

Also, in my Desired_output, sum is simply the column sum of each element of TABLE.

Is my Desired_output possible in R?

(I tried addmargins(TABLE) %>% as.data.frame.array() without success)

DATA <- structure(list(study = c(1, 1, 1, 1, 1, 1, 2, 2, 3, 4, 4, 4, 
       4, 5, 5), time = c("baseline", "posttest1", "posttest2", "baseline", 
       "posttest1", "posttest2", "posttest1", "posttest1", "posttest1", 
        "posttest1", "posttest1", "posttest1", "posttest1", "posttest1", 
         "posttest2"), treat_grp = c("conventional", "conventional", "conventional", 
           "conventional", "conventional", "conventional", "conventional", 
            "framework_notes", "conventional", "conventional", "conventional", 
             "conventional", "conventional", "vocabulary_notebook", "vocabulary_notebook"
               )), row.names = c(NA, -15L), class = "data.frame")

TABLE <- table(DATA)

Desired_output <- read.table(header=T,text="
n_study     time      treat_grp       sum
  1     baseline   conventional         2
  4     posttest1  conventional         8
  1     posttest2  conventional         2
  0     baseline   framework_notes      0
  1     posttest1  framework_notes      1
  0     posttest2  framework_notes      0
  0     baseline   vocabulary_notebook  0
  1     posttest1  vocabulary_notebook  1
  1     posttest2  vocabulary_notebook  1")



Solution

  • Here is a base R solution:

    data.frame(table(DATA))|>
      aggregate(cbind(n_study = Freq > 0, sum = Freq) ~ time + treat_grp, sum)
    
           time           treat_grp n_study sum
    1  baseline        conventional       1   2
    2 posttest1        conventional       4   8
    3 posttest2        conventional       1   2
    4  baseline     framework_notes       0   0
    5 posttest1     framework_notes       1   1
    6 posttest2     framework_notes       0   0
    7  baseline vocabulary_notebook       0   0
    8 posttest1 vocabulary_notebook       1   1
    9 posttest2 vocabulary_notebook       1   1