I wonder how to turn my TABLE
below to my Desired_output
?
In my Desired_output
, n_study
is the number of rows in TABLE
for which the value corresponding to a column other than study
is NOT 0.
For example, for the first element in TABLE
(, , treat_grp = conventional
), n_study
for baseline
is 1
because there is only 1 row for baseline
with a non 0 element, and so on.
Also, in my Desired_output
, sum
is simply the column sum of each element of TABLE
.
Is my Desired_output
possible in R?
(I tried addmargins(TABLE) %>% as.data.frame.array()
without success)
DATA <- structure(list(study = c(1, 1, 1, 1, 1, 1, 2, 2, 3, 4, 4, 4,
4, 5, 5), time = c("baseline", "posttest1", "posttest2", "baseline",
"posttest1", "posttest2", "posttest1", "posttest1", "posttest1",
"posttest1", "posttest1", "posttest1", "posttest1", "posttest1",
"posttest2"), treat_grp = c("conventional", "conventional", "conventional",
"conventional", "conventional", "conventional", "conventional",
"framework_notes", "conventional", "conventional", "conventional",
"conventional", "conventional", "vocabulary_notebook", "vocabulary_notebook"
)), row.names = c(NA, -15L), class = "data.frame")
TABLE <- table(DATA)
Desired_output <- read.table(header=T,text="
n_study time treat_grp sum
1 baseline conventional 2
4 posttest1 conventional 8
1 posttest2 conventional 2
0 baseline framework_notes 0
1 posttest1 framework_notes 1
0 posttest2 framework_notes 0
0 baseline vocabulary_notebook 0
1 posttest1 vocabulary_notebook 1
1 posttest2 vocabulary_notebook 1")
Here is a base R solution:
data.frame(table(DATA))|>
aggregate(cbind(n_study = Freq > 0, sum = Freq) ~ time + treat_grp, sum)
time treat_grp n_study sum
1 baseline conventional 1 2
2 posttest1 conventional 4 8
3 posttest2 conventional 1 2
4 baseline framework_notes 0 0
5 posttest1 framework_notes 1 1
6 posttest2 framework_notes 0 0
7 baseline vocabulary_notebook 0 0
8 posttest1 vocabulary_notebook 1 1
9 posttest2 vocabulary_notebook 1 1