I have the following code that samples 1 row 5 times, 2 rows 5 times, 3 rows 5 times, and so on. After running the lapply and converting the result to a data frame for comparisons, I need a way to turn the ID variable ("source" in the code below) into a grouping variable. Rows 1:5 of "want" would be "group 1", rows 6:15 would be "group 2", rows 16:30 would be "group 3", and so on. These are the groupings because group 1 has only one replicate of each number in the ID column, group 2 has two replicates, group 3 has three replicates, and so on.
Code
library(dplyr)  # provides bind_rows() and %>%

iris <- iris
select_rows <- 1:4
n_times <- 5
inds <- nrow(iris)
result <- lapply(select_rows, function(x)
  replicate(n_times, iris[sample(inds, x), ], simplify = FALSE))
want <- bind_rows(result, .id = 'source')
View(want)
If I wanted to run an ANOVA on each column, for example, the ID column on its own would not group the observations correctly.
I suppose I could use a combination of ifelse and mutate to manually go through and assign the rows to certain groups, but I hope to avoid this as I will need to do this for several varying data frames.
I also tried the following code to assign groups over a sequence, but realized it wouldn't work because the numbers of observations in each group are not the same:
final<- want %>% mutate(Group = rep(seq(1,ceiling(nrow(want)/5)),each = 5))
Any help would be appreciated.
Use the times argument to rep to get five 1's, ten 2's, fifteen 3's, etc.
dat$id <- rep(1:3, times=1:3*5)
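Applied to the question's setup, the same idea might look like the sketch below. It reuses select_rows and n_times from the question, but builds "want" with base R's do.call(rbind, ...) instead of bind_rows so that it runs without dplyr. That substitution, the set.seed call, and the column name Group are assumptions made for illustration. It also assumes the rows of "want" are still in the order the samples were drawn.

```r
set.seed(1)  # only so the sketch is reproducible; not part of the original code

select_rows <- 1:4
n_times <- 5

# Same sampling as in the question: x rows, drawn n_times times, for each x
result <- lapply(select_rows, function(x)
  replicate(n_times, iris[sample(nrow(iris), x), ], simplify = FALSE))

# Flatten one level (a list of 20 data frames) and stack them with base R
want <- do.call(rbind, unlist(result, recursive = FALSE))

# Group g contributes g * n_times rows, so repeat each label that many times
want$Group <- rep(select_rows, times = select_rows * n_times)

table(want$Group)
```

Because the repeat counts are computed from select_rows and n_times rather than typed in by hand, the same line should carry over to other data frames built the same way with different settings.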