I'm trying to find a way to sample N whole groups from a data frame.
For example, if we had the below dataframe:
   group value
1      a     1
2      a     2
3      a     3
4      b     4
5      b     5
6      c     6
7      d     7
8      d     8
9      d     9
10     d    10
Code to reproduce it:
df <- data.frame(group = c(rep("a", 3),
                           rep("b", 2),
                           "c",
                           rep("d", 4)),
                 value = 1:10)
If we wanted to sample n = 2 groups, I'd like my output to be something like:
  group value
1     a     1
2     a     2
3     a     3
4     c     6
if, for example, the n = 2 groups selected for sampling were a and c.
I tried using group_by(group) %>% slice_sample(n = 2); however, that gives a sample of two rows from every group, as opposed to every observation from two groups, which is what I am after.
Ideally a tidyverse solution would be best, but it might require a new function.
Thank you!
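library(dplyr)
set.seed(123)  # make the random draw reproducible
# Draw 2 group labels at random, then keep every row that belongs to them
filter(df, group %in% sample(unique(df$group), 2))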
  group value
1     c     6
2     d     7
3     d     8
4     d     9
5     d    10
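The filter() call works by sampling from the distinct group labels rather than from the rows, and then keeping every row whose label was drawn. If this is needed in more than one place, it could be wrapped in a small helper. The sketch below is one possible way to do that; sample_groups is a hypothetical name, not a dplyr function, and it assumes dplyr 1.0 or later so that the {{ }} embracing syntax is available.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): keep all rows of n randomly chosen groups.
sample_groups <- function(data, group_col, n) {
  # Distinct group labels, extracted as a plain vector
  groups <- data %>% distinct({{ group_col }}) %>% pull({{ group_col }})
  # Fail early if more groups are requested than exist
  stopifnot(n <= length(groups))
  # Sampling indices avoids sample()'s special handling of a single number
  chosen <- groups[sample.int(length(groups), n)]
  # Keep every row whose group label was drawn
  filter(data, {{ group_col }} %in% chosen)
}

# Same example data as in the question
df <- data.frame(group = c(rep("a", 3), rep("b", 2), "c", rep("d", 4)),
                 value = 1:10)

set.seed(123)
sampled <- sample_groups(df, group, n = 2)
sampled
```

The groups are drawn without replacement, so each selected group appears once with all of its rows. Which groups come back depends on the seed and on the R version's random number generator.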