I have a dataset that shows the walking speed of different age classes, measured in two different periods. Here is a reproducible example:
Period<-rep(c(1,2),times=c(9,18))
Class<-rep(c("child","teen","adult","toddler","child","teen","adult"),times=c(2,3,4,3,4,5,6))
Speed<-c(2,3,3,4,5,5,4,4,6,1,0.7,0.3,1,2,3,2,2,5,5,4,2,3,5,4,6,5,4)
data<-data.frame(Period,Class,Speed)
Period Class Speed
1 1 child 2.0
2 1 child 3.0
3 1 teen 3.0
4 1 teen 4.0
5 1 teen 5.0
6 1 adult 5.0
7 1 adult 4.0
8 1 adult 4.0
9 1 adult 6.0
10 2 toddler 1.0
11 2 toddler 0.7
12 2 toddler 0.3
13 2 child 1.0
14 2 child 2.0
15 2 child 3.0
16 2 child 2.0
17 2 teen 2.0
18 2 teen 5.0
19 2 teen 5.0
20 2 teen 4.0
21 2 teen 2.0
22 2 adult 3.0
23 2 adult 5.0
24 2 adult 4.0
25 2 adult 6.0
26 2 adult 5.0
27 2 adult 4.0
I want to create a summary table with the following attributes:
I would also like to add three "total" rows:
The resulting summary table should look like this:
Class Period mean min max SE n
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 adult 1 4.75 4 6 0.479 4
2 adult 2 4.5 3 6 0.428 2
3 child 1 2.5 2 3 0.5 3
4 child 2 2 1 3 0.408 6
5 teen 1 4 3 5 0.577 4
6 teen 2 3.6 2 5 0.678 5
7 toddler 2 0.667 0.3 1 0.203 3
8 Total.Period2 2 4 2 6 0.408 9
9 Total.Period1 1 3.06 0.3 6 0.414 18
10 Total NA 3.37 0.3 6 0.315 27
I can currently do this using functions from the tidyverse and rbind() in R. I've also been using the function std.error from the package plotrix() to calculate the standard errors. However, this is quite a lengthy process and I'd like to make it more efficient. Any suggestions?
Try this to calculate the summary:
library(tidyverse)
rbind(data %>% group_by(Class, Period) %>% summarise(mean = mean(Speed), min = min(Speed), max = max(Speed), SE = sd(Speed), n = n()) %>% ungroup(),
cbind("Class" = c("Period.1","Period.2"),data %>% group_by(Period) %>% summarise(mean = mean(Speed), min = min(Speed), max = max(Speed), SE = sd(Speed), n = n()) %>% ungroup()),
cbind("Class" = "Total", "Period" = NA,data %>% summarise(mean = mean(Speed), min = min(Speed), max = max(Speed), SE = sd(Speed), n = n()) %>% ungroup()))
As far as I know, this is the fastest way to do it