Use sample() with conditions in R


I created a dataset to randomly assign treatments to experimental subjects. Each individual will be treated three times, once per trial. There are 7 treatments, and I need to make sure that no individual receives the same treatment more than once while the assignment stays random. There are 35 individuals and 7 treatments, so there are 5 replicates of each treatment.

The data:

set.seed(566)
treatments <- rep(c(2, 4, 8, 16, 32, 64, 100), each = 5)
random_design <- data.frame(individual = 1:35,
                            trial1 = sample(treatments),
                            trial2 = sample(treatments),
                            trial3 = sample(treatments))

As you can see, some individuals receive the same treatment in more than one trial. Is there a way to impose a condition on sample() so that individual x cannot get the same treatment as in a previous trial?


Solution

  • It sounds like you want to assign each individual three distinct treatments at random. With K treatments, that means drawing 3 of the K without replacement for each individual, and then merging in the treatment effects. Using your numbers and data.table, here is one solution:

    set.seed(566)
    library(data.table)
    
    exp_num = 7
    # set up a data.table that maps each experiment number to its treatment effect
    treat_dt = data.table("experiment_num" = 1:exp_num, "treatment_effect" = c(2,4,8,16,32,64,100))
    
    # now create a data.table of subjects x trials (one row per id and trial)
    subj_dt = data.table(expand.grid("id" = 1:35, "trial" = paste0("trial",1:3)))
    
    # now randomly assign three experiments without replacement within each id
    subj_dt[, exp_assigned := sample(1:exp_num, 3, replace = FALSE), by = id]
    
    # now merge in the effects from treat_dt by experiment number
    subj_dt = merge(subj_dt, treat_dt, by.x = "exp_assigned", by.y = "experiment_num", all.x = TRUE, all.y = FALSE)
    
    # and you're done! optionally reshape to wide format so that each id is a single row
    alt_dt = dcast(subj_dt[,.(id,trial,treatment_effect)], id ~ trial, value.var = "treatment_effect")
    

    The first rows of alt_dt then look as follows:

    > head(alt_dt)
       id trial1 trial2 trial3
    1:  1    100     32      8
    2:  2    100     64     32
    3:  3      4     16      2
    4:  4    100     64      8
    5:  5      8     16      4
    6:  6     64    100      8
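
Note that drawing three treatments independently for each individual guarantees no repeats within an individual, but it does not fix how often each treatment occurs in a trial, so the 5 replicates per treatment from the question are not preserved. If that balance matters, one possible alternative is sketched below in base R: shuffle the balanced treatment vector for each trial, then swap entries until no individual repeats a treatment from an earlier trial. The helper `draw_trial` is a hypothetical name introduced here for illustration, not a function from any package, and the result is not guaranteed to be uniformly distributed over all valid designs.

```r
set.seed(566)
treatments <- rep(c(2, 4, 8, 16, 32, 64, 100), each = 5)
n <- length(treatments)

# Hypothetical helper: shuffle `pool`, then swap entries until no position
# repeats a treatment found in the same row of `previous`
# (`previous` is a matrix with one column per earlier trial).
draw_trial <- function(pool, previous) {
  candidate <- sample(pool)
  repeat {
    # rows where the candidate matches a treatment from an earlier trial
    bad <- which(rowSums(candidate == previous) > 0)
    if (length(bad) == 0) return(candidate)
    i <- bad[1]
    j <- sample.int(length(candidate), 1)
    # swap only if both positions end up conflict-free; swapping leaves the
    # treatment counts unchanged, so each trial stays balanced
    if (!(candidate[j] %in% previous[i, ]) && !(candidate[i] %in% previous[j, ])) {
      candidate[c(i, j)] <- candidate[c(j, i)]
    }
  }
}

trial1 <- sample(treatments)
trial2 <- draw_trial(treatments, cbind(trial1))
trial3 <- draw_trial(treatments, cbind(trial1, trial2))

random_design <- data.frame(individual = 1:n, trial1, trial2, trial3)

# sanity checks: no repeats within an individual, 5 replicates per trial
stopifnot(all(apply(random_design[, -1], 1, anyDuplicated) == 0))
stopifnot(all(sapply(random_design[, -1], table) == 5))
```

Each accepted swap leaves both swapped positions free of conflicts, so the number of conflicts strictly decreases and the loop ends after a small number of swaps for this design (35 individuals, 7 treatments, 3 trials).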