I have a data frame df with 100 rows. It holds two species, A and B (column sp), and two variables, a and b.
I want to build a sampler that, for each set, randomly picks 6 rows (reps) from df. The samples for A must come only from rows where sp is A, and likewise for B. I want to repeat this 500 times for each of species A and B.
I attempted a for loop, but when I print sampling it shows a single row with 6 columns. I would appreciate any guidance.
a <- rnorm(100, 2,1)
b <- rnorm(100, 2,1)
sp <- rep(c("A","B"), each = 50)
df <- data.frame(a,b,sp)
df.sample <- for(i in 1:1000){
sampling <- sample(df[i,],6,replace = TRUE)
}
#Output in a single row
a a.1 sp b sp.1 a.2
1000 1.68951 1.68951 B 1.395995 B 1.68951
#Expected dataframe
df.sample
set rep a b sp
1 1 1 9 A
1 2 3 2 A
1 3 0 2 A
1 4 1 2 A
1 5 1 6 A
1 6 4 2 A
2 1 1 2 B
2 2 5 2 B
2 3 1 2 B
2 4 1 6 B
2 5 1 8 B
2 6 9 2 B
....
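Your loop indexes df[i,], which is a single row, so sample() picks 6 of that row's columns rather than 6 rows. A for loop also returns NULL, so nothing is stored in df.sample. Here's how I would do it (using tidyverse):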
data:
a <- rnorm(100, 2,1)
b <- rnorm(100, 2,1)
sp <- rep(c("A","B"), each = 50)
df <- data.frame(a,b,sp)
library(tidyverse)

# create an empty table with desired columns
output <- tibble(a = numeric(),
                 b = numeric(),
                 sp = character(),
                 set = numeric())
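# sampling in a loop: 500 sets, each with 6 rows per species
set.seed(42)
for (i in 1:500) {
  # draw 6 rows with replacement from species A only, tagged with the set number
  samp1 <- df %>% filter(sp == 'A') %>% sample_n(6, replace = TRUE) %>% mutate(set = i)
  # same for species B
  samp2 <- df %>% filter(sp == 'B') %>% sample_n(6, replace = TRUE) %>% mutate(set = i)
  # append both samples to the running result
  output <- output %>% add_row(bind_rows(samp1, samp2))
}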
Result
> head(output, 20)
# A tibble: 20 × 4
a b sp set
<dbl> <dbl> <chr> <dbl>
1 2.59 3.31 A 1
2 1.84 1.66 A 1
3 2.35 1.17 A 1
4 2.33 1.95 A 1
5 0.418 1.11 A 1
6 1.19 2.54 A 1
7 2.35 0.899 B 1
8 1.19 1.63 B 1
9 0.901 0.986 B 1
10 3.12 1.75 B 1
11 2.28 2.61 B 1
12 1.37 3.47 B 1
13 2.33 1.95 A 2
14 1.84 1.66 A 2
15 3.76 1.26 A 2
16 2.96 3.10 A 2
17 1.03 1.81 A 2
18 1.42 2.00 A 2
19 0.901 0.986 B 2
20 2.37 1.39 B 2
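The result above has no rep column, which the expected layout in the question includes. As a rough sketch of one way to add it, the variant below groups by sp and uses dplyr's slice_sample() (this assumes dplyr 1.0.0 or later) instead of filtering each species by hand. The names n_sets, n_reps and draw_set are made up for illustration. As in the loop above, set counts the draw and is shared by both species, which differs slightly from the numbering in the question's example.

```r
library(dplyr)

set.seed(42)
df <- data.frame(a  = rnorm(100, 2, 1),
                 b  = rnorm(100, 2, 1),
                 sp = rep(c("A", "B"), each = 50))

n_sets <- 500  # hypothetical name: number of sets to draw
n_reps <- 6    # hypothetical name: rows drawn per species in each set

# One set: n_reps rows per species, sampled with replacement within each species
draw_set <- function(i) {
  df %>%
    group_by(sp) %>%
    slice_sample(n = n_reps, replace = TRUE) %>%
    mutate(set = i, rep = row_number()) %>%  # rep runs 1..n_reps within each species
    ungroup()
}

# Stack all sets and put the columns in the order shown in the question
df.sample <- bind_rows(lapply(seq_len(n_sets), draw_set)) %>%
  select(set, rep, a, b, sp)

head(df.sample, 12)
```

Building the pieces with lapply() and binding them once also avoids growing output row by row inside the loop, which may matter if the number of sets gets large.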