r, data.table, sample

R Data.Table Random Sample Groups


DATA = data.table(STUDENT = c(1,1,2,2,2,2,2,3,3,3,3,3,4),
SCORE = c(5,6,8,3,14,5,6,9,0,12,13,14,19))

WANT = data.table(STUDENT = c(1,1,4),
SCORE = c(5,6,19))

I have DATA and want to create WANT, which takes a random sample of 2 STUDENT values and keeps all of the rows for those students. The WANT shown above is one example of a possible result.

I tried this, without success:

WANT = DATA[ , .SD[sample(x = .N, size = 2)], by = STUDENT]

Solution

  • Sample the unique values of STUDENT, then keep every row belonging to the sampled students:

    library(data.table)
    set.seed(1357)
    DATA[STUDENT %in% sample(unique(STUDENT), 2)]
    
    #   STUDENT SCORE
    #1:       1     5
    #2:       1     6
    #3:       4    19
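
If the same pattern is needed for other tables or grouping columns, the one-liner above could be wrapped in a small helper. This is only a sketch: `sample_groups` and its arguments are hypothetical names, not part of data.table. It samples by position with `sample.int()` because base R's `sample(x, ...)` treats a single numeric `x` as `1:x`, which matters if only one group is present.

```r
library(data.table)

DATA <- data.table(
  STUDENT = c(1,1,2,2,2,2,2,3,3,3,3,3,4),
  SCORE   = c(5,6,8,3,14,5,6,9,0,12,13,14,19)
)

# Hypothetical helper (not a data.table function):
# keep all rows of `n` randomly chosen groups of `group_col`.
sample_groups <- function(dt, group_col, n) {
  ids <- unique(dt[[group_col]])
  stopifnot(n <= length(ids))
  # Sample positions, not values, so a lone numeric id is not expanded to 1:id
  keep <- ids[sample.int(length(ids), n)]
  rows <- dt[[group_col]] %in% keep
  dt[rows]
}

set.seed(1357)
WANT <- sample_groups(DATA, "STUDENT", 2)
```

WANT then contains exactly two distinct STUDENT values, each with all of its rows from DATA.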