
How to sample across a dataset with two factors in it?


I have a data frame with two species, A and B, and two variables, a and b, for a total of 100 rows.

I want to create a sampler that, for each set, randomly picks 6 rows (reps) from df. The samples for A must come only from rows where sp is A, and likewise for B. I want to do this 500 times for each of species A and B.

I attempted a for loop, but when I inspect sampling afterwards it shows a single row with 6 columns. I would appreciate any guidance.

a <- rnorm(100, 2,1)
b <- rnorm(100, 2,1)
sp <- rep(c("A","B"), each = 50)  

df <- data.frame(a,b,sp)

df.sample <- for(i in 1:1000){
             sampling <- sample(df[i,],6,replace = TRUE)
}

#Output in a single row
           a     a.1 sp        b sp.1     a.2
1000 1.68951 1.68951  B 1.395995    B 1.68951

#Expected dataframe
df.sample
set rep a b sp
  1  1  1 9  A
  1  2  3 2  A
  1  3  0 2  A
  1  4  1 2  A
  1  5  1 6  A
  1  6  4 2  A
  2  1  1 2  B
  2  2  5 2  B
  2  3  1 2  B
  2  4  1 6  B
  2  5  1 8  B
  2  6  9 2  B
  ....
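
A likely reason the attempt above collapses to one row, shown as a minimal sketch on toy data (the seed and the names `one_row`, `cols` and `res` are assumptions for illustration): `df[i, ]` is a one-row data frame, and `sample()` on a data frame draws from its columns, not its rows. In addition, a `for` loop returns `NULL`, so assigning the loop itself to `df.sample` stores nothing.

```r
# Toy data with the same shape as in the question (seed is an assumption)
set.seed(1)
df <- data.frame(a  = rnorm(100, 2, 1),
                 b  = rnorm(100, 2, 1),
                 sp = rep(c("A", "B"), each = 50))

one_row <- df[1, ]                           # a data frame with 1 row, 3 columns
cols <- sample(one_row, 6, replace = TRUE)   # draws 6 of the 3 COLUMNS, with repeats
dim(cols)                                    # 1 row, 6 columns

res <- for (i in 1:3) i                      # the value of a for loop is NULL
is.null(res)
```

To sample rows, the row indices have to be sampled instead, and each iteration's result has to be stored explicitly, for example in a list.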

Solution

  • Here's how I would do it (using tidyverse):

    data:

    a <- rnorm(100, 2,1)
    b <- rnorm(100, 2,1)
    sp <- rep(c("A","B"), each = 50)  
    df <- data.frame(a,b,sp)
    
    
    # create an empty table with desired columns
    
    library(tidyverse)
    output <- tibble(a = numeric(), 
                     b = numeric(), 
                     sp = character(), 
                     set = numeric())
    
    # sampling in a loop: each set holds 6 rows from A and 6 rows from B
    
    set.seed(42)
    for (i in 1:500) {
      samp1 <- df %>% filter(sp == 'A') %>% sample_n(6, replace = TRUE) %>% mutate(set = i)
      samp2 <- df %>% filter(sp == 'B') %>% sample_n(6, replace = TRUE) %>% mutate(set = i)
      output <- output %>% add_row(bind_rows(samp1, samp2))
    }
    

    Result

    > head(output, 20)
    # A tibble: 20 × 4
           a     b sp      set
       <dbl> <dbl> <chr> <dbl>
     1 2.59  3.31  A         1
     2 1.84  1.66  A         1
     3 2.35  1.17  A         1
     4 2.33  1.95  A         1
     5 0.418 1.11  A         1
     6 1.19  2.54  A         1
     7 2.35  0.899 B         1
     8 1.19  1.63  B         1
     9 0.901 0.986 B         1
    10 3.12  1.75  B         1
    11 2.28  2.61  B         1
    12 1.37  3.47  B         1
    13 2.33  1.95  A         2
    14 1.84  1.66  A         2
    15 3.76  1.26  A         2
    16 2.96  3.10  A         2
    17 1.03  1.81  A         2
    18 1.42  2.00  A         2
    19 0.901 0.986 B         2
    20 2.37  1.39  B         2
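
The same idea can also be sketched in base R without a growing table. This is not the method used above but an alternative under stated assumptions: the data are regenerated with a seed so the example is self-contained, `draw_set()` and `rows_by_sp` are hypothetical names, and a `rep` column is added because the expected output in the question has one. Here, as in the loop above, each set contains 6 rows per species.

```r
# Reproducible toy data with the same shape as in the question
set.seed(42)
df <- data.frame(a  = rnorm(100, 2, 1),
                 b  = rnorm(100, 2, 1),
                 sp = rep(c("A", "B"), each = 50))

n_sets <- 500  # number of sets
n_reps <- 6    # rows drawn per species in each set

# Row indices of df split by species: list(A = 1:50, B = 51:100)
rows_by_sp <- split(seq_len(nrow(df)), df$sp)

# Hypothetical helper: build one set with n_reps rows per species
draw_set <- function(i) {
  idx <- unlist(lapply(rows_by_sp, function(rows) {
    # index into `rows` so the draw never leaves the species
    rows[sample.int(length(rows), n_reps, replace = TRUE)]
  }), use.names = FALSE)
  out <- data.frame(set = i,
                    rep = rep(seq_len(n_reps), times = length(rows_by_sp)),
                    df[idx, ])
  rownames(out) <- NULL  # drop row names inherited from df
  out
}

# Build all sets in a list and bind once at the end
output <- do.call(rbind, lapply(seq_len(n_sets), draw_set))
head(output, 12)
```

Collecting the pieces in a list and binding them once avoids copying the whole table on every iteration, which matters more as the number of sets grows.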