Here is an example of my data.frame; let's assume that the date col represents days:
df = read.table(text = 'ID date
a 1
a 2
a 3
a 4
a 7
a 12', header = TRUE)
So my days here range from 1 to 12, and I would like to create 100 data.frames where, in each one, the rows are grouped by date (and ID) into blocks of 3 consecutive days, with the blocks positioned randomly.
e.g.
df1
ID date group
a 1 1 #group 1 = 1, 2, 3
a 2 1
a 3 1
a 4 2 # group 2 = 4, 5, 6
a 7 3 # group 3 = 7, 8, 9
a 12 4 # group 4 = 10, 11, 12
df2
ID date group
a 1 4
a 2 1 #group 1 = 2, 3, 4
a 3 1
a 4 1
a 7 2 #group 2 = 5, 6, 7 --- group 3 = 8, 9, 10
a 12 4 # group 4 = 11, 12 and start again from the beginning 1
df3
ID date group
a 1 1
a 2 1
a 3 2 #group 2 = 3, 4, 5
a 4 2
a 7 3 #group 3 = 6, 7, 8 -- group 4 = 9, 10, 11
a 12 1 #group 1 = 12, 1, 2
etc...
Note that the group col groups the rows in threes by consecutive days, which do not necessarily all appear in the data.frame; the only random part of the whole trick is the start day of group 1.
Do you have any suggestions?
Not sure, but you can create an empty list and fill it with new data frames similar to your example:
set <- list()
for (i in 1:100) {
  set[[i]] <- cbind(df, group = sample(rep(c(1, 2), each = 3)))  # labels 1,1,1,2,2,2 in random order
}
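Note that the loop above shuffles the labels 1 and 2 across the rows, so the groups are not tied to consecutive days. If the goal is the circular grouping described in the question, a minimal sketch could look like the following. The helper name assign_groups and its arguments (start, n_days, window) are hypothetical, not from the original post, and the sketch assumes the days run from 1 to n_days, that n_days is a multiple of window, and that all IDs share the same start day within one data.frame.

```r
df <- read.table(text = 'ID date
a 1
a 2
a 3
a 4
a 7
a 12', header = TRUE)

# Hypothetical helper: label each date with its block of `window` consecutive
# days, counting from `start` and wrapping around after day `n_days`.
assign_groups <- function(dates, start, n_days = 12, window = 3) {
  ((dates - start) %% n_days) %/% window + 1
}

# The three examples from the question
assign_groups(df$date, start = 1)   # 1 1 1 2 3 4  (df1)
assign_groups(df$date, start = 2)   # 4 1 1 1 2 4  (df2)
assign_groups(df$date, start = 12)  # 1 1 2 2 3 1  (df3)

# 100 data.frames, each with a randomly drawn start day for group 1
set.seed(42)                         # assumption: fixed seed for reproducibility
n_days <- max(df$date)               # assumption: the last observed day closes the cycle
starts <- sample.int(n_days, 100, replace = TRUE)
dfs <- lapply(starts, function(s) {
  cbind(df, group = assign_groups(df$date, s, n_days))
})
```

Here dfs[[i]] is the i-th data.frame and starts[i] is the start day it used. The modulo step is what makes day 12 wrap back to day 1, as in df2 and df3 of the question.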