I have a dataframe that looks like this:
library(tidyverse)
x <- tibble(
batch = rep(c(1,2), each=10),
exp_id = c(rep('a',3),rep('b',2),rep('c',5),rep('d',6),rep('e',4))
)
I can run the code below to get the count per exp_id:
x %>% group_by(batch,exp_id) %>%
summarise(count=n())
which generates:
batch exp_id count
<dbl> <chr> <int>
1 1 a 3
2 1 b 2
3 1 c 5
4 2 d 6
5 2 e 4
A really ugly way to generate the mean of these counts is:
x %>% group_by(batch,exp_id) %>%
summarise(count=n()) %>%
ungroup() %>%
group_by(batch) %>%
summarise(avg_exp = mean(count))
which generates:
batch avg_exp
<dbl> <dbl>
1 1 3.33
2 2 5
Is there a more succinct and "tidy" way to generate this?
library(dplyr)
group_by(x, batch) %>%
summarize(avg_exp = mean(table(exp_id)))
# # A tibble: 2 x 2
# batch avg_exp
# <dbl> <dbl>
# 1 1 3.33
# 2 2 5
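This works because table(exp_id) counts the occurrences of each exp_id within the current batch, and mean() then averages those counts. If you would rather stay with dplyr verbs, here are two possible variants, sketched under the assumption that exp_id is a character column with no NA values (table() drops NA by default and reports zero counts for unused factor levels, so the results could differ otherwise). The names res_count and res_ratio are only illustrative.

```r
library(dplyr)

# Same data as in the question
x <- tibble(
  batch = rep(c(1, 2), each = 10),
  exp_id = c(rep("a", 3), rep("b", 2), rep("c", 5), rep("d", 6), rep("e", 4))
)

# Variant 1: count() replaces group_by() + summarise(n());
# its result is ungrouped, so no ungroup() is needed
res_count <- x %>%
  count(batch, exp_id, name = "count") %>%
  group_by(batch) %>%
  summarise(avg_exp = mean(count))

# Variant 2: the mean count per exp_id equals
# (rows in the batch) / (distinct exp_id values in the batch)
res_ratio <- x %>%
  group_by(batch) %>%
  summarise(avg_exp = n() / n_distinct(exp_id))
```

Both should give the same avg_exp values as the table() version. The second variant avoids building the intermediate counts at all, at the cost of being less obvious about what is being averaged.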