
Using R to Randomly Assign Treatment and Control Groups by single IDs


I would like to use R for solving a problem of experimental design in which I will randomly assign my experimental units to treatment or control groups. The problem is the following:

Let's say that I have 120 plants with unique IDs (split across 4 clones, 30 plants each), 3 time points, 2 pathogens and 2 control groups. For each time point, I would like to assign within each clone: 3 plants to pathogen A, 3 to pathogen B, 2 to control A and 2 to control B. That is 10 plants per clone per time point, so 4 × 3 × 10 = 120 plants in total.

clones <- c(rep("clone A", 30), rep("clone B", 30), rep("clone C", 30), rep("clone D", 30))
IDs <- 1:120
plants <- data.frame(IDs = IDs, 
                     clones = clones)

# How can I randomly assign the following to each ID?
control <- c("control A", "control B")
pathogen <- c("pathogen A", "pathogen B")
time_point <- c("T1", "T2", "T3")

Thanks for the help!


Solution

    library(tidyverse)
    
    # fix the random seed so the assignment is reproducible
    set.seed(1)
    
    group <- c(
      rep("Pathogen A", 3),
      rep("Pathogen B", 3),
      rep("control A", 2),
      rep("control B", 2)
    )
    
    sampling <-
      # every clone and time point combination gets all 10 group slots
      expand_grid(
        group,
        clone = c("Clone A", "Clone B", "Clone C", "Clone D"),
        time = c("T1", "T2", "T3")
      ) %>%
      arrange(clone) %>%
      mutate(id = row_number()) %>%
      # shuffle the group labels within each clone and time point (stratified randomization)
      group_by(clone, time) %>%
      mutate(group = group %>% sample())
    
    sampling
    #> # A tibble: 120 × 4
    #> # Groups:   clone, time [12]
    #>    group      clone   time     id
    #>    <chr>      <chr>   <chr> <int>
    #>  1 control B  Clone A T1        1
    #>  2 Pathogen A Clone A T2        2
    #>  3 Pathogen B Clone A T3        3
    #>  4 Pathogen B Clone A T1        4
    #>  5 Pathogen A Clone A T2        5
    #>  6 control B  Clone A T3        6
    #>  7 control A  Clone A T1        7
    #>  8 Pathogen B Clone A T2        8
    #>  9 Pathogen A Clone A T3        9
    #> 10 Pathogen A Clone A T1       10
    #> # … with 110 more rows
    
    sampling %>%
      group_by(clone) %>%
      summarise(
        min_id = min(id),
        max_id = max(id)
      )
    #> # A tibble: 4 × 3
    #>   clone   min_id max_id
    #>   <chr>    <int>  <int>
    #> 1 Clone A      1     30
    #> 2 Clone B     31     60
    #> 3 Clone C     61     90
    #> 4 Clone D     91    120
    
    sampling %>%
      filter(clone == "Clone A")
    #> # A tibble: 30 × 4
    #> # Groups:   clone, time [3]
    #>    group      clone   time     id
    #>    <chr>      <chr>   <chr> <int>
    #>  1 control B  Clone A T1        1
    #>  2 Pathogen A Clone A T2        2
    #>  3 Pathogen B Clone A T3        3
    #>  4 Pathogen B Clone A T1        4
    #>  5 Pathogen A Clone A T2        5
    #>  6 control B  Clone A T3        6
    #>  7 control A  Clone A T1        7
    #>  8 Pathogen B Clone A T2        8
    #>  9 Pathogen A Clone A T3        9
    #> 10 Pathogen A Clone A T1       10
    #> # … with 20 more rows
    

    Created on 2022-04-15 by the reprex package (v2.0.1)
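
In the answer above, ids are numbered after arranging by clone, so the time points simply cycle T1, T2, T3 along the id sequence and only the group labels are shuffled. If the time point should be randomized per plant as well, one possible variant is to build the 30 (time point, group) slots for a clone and shuffle them across that clone's plants. The following is a minimal base R sketch of that idea, working directly on the `plants` data frame from the question. The column names `time_point` and `group` and the helper objects `groups` and `slots` are names chosen for this illustration, not part of the original post.

```r
# Sketch only: `groups`, `slots`, `time_point` and `group` are
# illustrative names, not taken from the original question or answer.
set.seed(1)  # any fixed seed makes the assignment reproducible

# The plants table from the question: 120 IDs, 30 per clone
plants <- data.frame(
  IDs    = 1:120,
  clones = rep(c("clone A", "clone B", "clone C", "clone D"), each = 30)
)

# The 10 treatment slots available per clone at each time point
groups <- rep(c("pathogen A", "pathogen B", "control A", "control B"),
              times = c(3, 3, 2, 2))

# All 30 (group, time point) slots for a single clone
slots <- expand.grid(group = groups,
                     time_point = c("T1", "T2", "T3"),
                     stringsAsFactors = FALSE)

plants$time_point <- NA_character_
plants$group <- NA_character_

for (cl in unique(plants$clones)) {
  rows <- which(plants$clones == cl)
  # each clone must have exactly as many plants as there are slots
  stopifnot(length(rows) == nrow(slots))
  # random permutation of the 30 slots, handed out to this clone's plants
  shuffled <- slots[sample(nrow(slots)), ]
  plants$time_point[rows] <- shuffled$time_point
  plants$group[rows] <- shuffled$group
}

# Balance check: counts per clone, time point and group
ftable(xtabs(~ clones + time_point + group, data = plants))
```

The final `ftable()` call prints the counts per clone, time point and group, which should be 3, 3, 2 and 2 for pathogen A, pathogen B, control A and control B in every row if the design is balanced as requested.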