Differentiated sampling rate by group

To train a machine learning model, I'm trying to sample a data frame that has a grouping variable, so that each group gets its own sampling rule. For instance, my data:

df <- data.frame(value = 1:10, label = c("a", "a", "b", rep("c", 7)))

For groups with, say, 3 rows or fewer, I want to take the whole group and no more; for bigger groups, I want a sample of size 3 without replacement.

So here, one possible result would be: df[c(1:3, 6, 9, 10), ]

If I use group_by and sample_n with a fixed size, I get a size error, because some groups are smaller than the requested sample. I thought of going "manual": splitting the data, sampling each piece differently, and then binding the rows back together. Is there a more efficient and direct way?
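
For reference, the manual route could look roughly like the base R sketch below. The names parts and manual are only illustrative, and the cap of 3 is the example threshold from above.

```r
# Example data from the question
df <- data.frame(value = 1:10, label = c("a", "a", "b", rep("c", 7)))

# Split into one data frame per label, sample each piece, then re-bind
parts <- lapply(split(df, df$label), function(g) {
  k <- min(nrow(g), 3)  # cap the sample size at the group size
  # sample.int() draws row positions without replacement
  g[sample.int(nrow(g), size = k), , drop = FALSE]
})
manual <- do.call(rbind, parts)

table(manual$label)  # a: 2, b: 1, c: 3 rows
```

This works, but it takes three separate steps for what is really a single per-group operation.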

Solution

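Use the size of the group, n(), inside sample_n so that the requested sample size never exceeds the group size:

library(dplyr)

# min(n(), 3) is evaluated per group: the whole group if it has 3 rows or fewer, otherwise 3 rows
df %>% group_by(label) %>% sample_n(min(n(), 3))

In the output below, the n column holds each group's original size (2, 1 and 7); it is not created by this call and appears to have been added separately.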

# A tibble: 6 x 3
# Groups:   label [3]
#  value label     n
#  <int> <fct> <int>
#1     1 a         2
#2     2 a         2
#3     3 b         1
#4     5 c         7
#5    10 c         7
#6     4 c         7
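
As an alternative sketch, assuming dplyr 1.0.0 or later (where sample_n is superseded by slice_sample), the cap does not need to be written out: when n is larger than a group, slice_sample silently truncates the result to the group size. The name sampled is only illustrative.

```r
library(dplyr)

# Example data from the question
df <- data.frame(value = 1:10, label = c("a", "a", "b", rep("c", 7)))

# slice_sample() samples without replacement by default (replace = FALSE);
# groups with fewer than 3 rows are returned whole.
sampled <- df %>%
  group_by(label) %>%
  slice_sample(n = 3) %>%
  ungroup()

count(sampled, label)  # a: 2, b: 1, c: 3 rows
```

Which rows are drawn from group c varies between runs; call set.seed() first if the sample needs to be reproducible.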