r, random, r-factor

How to generate a random treatment variable by factor?


Define

x <- data.frame(
     ID=letters[1:10],
     class = as.factor(c(rep(1,5),rep(2,5))),
     treat = rep(0,10))

so that

> x
   ID class treat
1   a     1     0
2   b     1     0
3   c     1     0
4   d     1     0
5   e     1     0
6   f     2     0
7   g     2     0
8   h     2     0
9   i     2     0
10  j     2     0

I have a treatment with two levels, 1 and 2. Within each class, I want to assign exactly one unit to each level, so that after randomization we get something like:

> x
   ID class treat
1   a     1     0
2   b     1     0
3   c     1     1
4   d     1     0
5   e     1     2
6   f     2     0
7   g     2     0
8   h     2     0
9   i     2     2
10  j     2     1 

Here units c and j receive treatment level 1, and units e and i receive level 2. All other units stay at 0.

How do I generate the treatment vector in R?


Solution

  • I'll assume you just want to assign one level 1 treatment and one level 2 treatment in each class. You can use the ddply function from the plyr package to do it easily:

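      set.seed(1)    # fix the seed so the random assignment is reproducible
      library(plyr)
      # within each class, pick two random rows and set their treat to 1 and 2
      ddply(x, .(class), transform,
            treat = replace(treat, sample(seq_along(treat), 2), 1:2))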
    
       ID class treat
    1   a     1     0
    2   b     1     1
    3   c     1     0
    4   d     1     0
    5   e     1     2
    6   f     2     0
    7   g     2     0
    8   h     2     1
    9   i     2     2
    10  j     2     0
    

    To explain: ddply splits the data frame by the class variable, and within each piece it "transforms" the treat column by replacing two randomly chosen entries with 1 and 2. The call sample(seq_along(treat), 2) picks two distinct random indices into the treat column, so the same unit can never receive both levels. Other variants (e.g. assigning more than one unit to each treatment level) can be done similarly.
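    As a rough illustration of the "more than one of each treatment type" variant, here is a minimal sketch in base R rather than plyr. It swaps in ave() for ddply, which is a different technique from the one in the answer above. The helper assign_treat and its arguments levels and k are hypothetical names introduced only for this example, and the data frame is assumed to be the one from the question.

```r
# Assumption: same toy data as in the question.
x <- data.frame(
  ID    = letters[1:10],
  class = as.factor(c(rep(1, 5), rep(2, 5))),
  treat = rep(0, 10))

# Hypothetical helper: give k randomly chosen units each treatment level.
assign_treat <- function(treat, levels = 1:2, k = 1) {
  n_treated <- length(levels) * k
  stopifnot(n_treated <= length(treat))        # need enough units in the group
  idx <- sample(seq_along(treat), n_treated)   # distinct random positions
  replace(treat, idx, rep(levels, each = k))
}

set.seed(1)
# ave() applies the helper separately within each class and keeps the row order
x$treat <- ave(x$treat, x$class, FUN = function(v) assign_treat(v, k = 2))

# Check: each class should have 1 untreated unit and 2 units per treatment level
table(x$class, x$treat)
```

    With k = 1 this should give the same kind of assignment as the ddply call, though not necessarily the same units for a given seed.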