I am trying to sample two data tables on a condition, combine the columns of the two resulting samples, repeat these steps, and append the resulting samples to a new data table. Extract of the two tables (they do not have the same length):
data1
month1 year
1: 1 2014
2: 2 2015
3: 3 2016
..
data2
month2
1: 4
2: 5
3: 6
..
first sample:
s1 = sample(data1[month1 == i], 100, replace=TRUE)
where i goes from 1 to n
second sample:
s2 = sample(data2[month2 > i], 100, replace=TRUE)
where the month in data2 should be greater than the month i selected for s1.
The two samples should be combined into a new data table, like dt1 = cbind(s1, s2).
I want to repeat these steps for every month i and create a new data set with all the resulting samples (pseudo-code):
for(i in 1:10){
  s1_i = sample(data1[month1 == i], 100, replace=TRUE)
  s2_i = sample(data2[month2 > i], 100, replace=TRUE)
  new_i = cbind(s1_i, s2_i)
}
allsamples = rbind(new_1, new_2, new_3, ...)
I have trouble writing this loop. It should not create a separate data set at every step, but only the allsamples data set, in which all samples are combined.
Here is my solution:
library(data.table) # needed for rbindlist()
newsample <- list()
begin_time <- 1
end_time <- 20
for (i in begin_time:end_time) {
  datasub1 <- data1[data1$var == i, ] # filter data1 on the condition (var is the month column)
  s1 <- datasub1[sample(nrow(datasub1), 10, replace = TRUE), ] # sample 10 rows with replacement
  datasub2 <- data2[data2$var2 > i, ] # keep only rows from later months in data2
  s2 <- datasub2[sample(nrow(datasub2), 10, replace = TRUE), ]
  newsample[[i - (begin_time - 1)]] <- cbind(s1, s2) # combine and store in list
}
allsample <- rbindlist(newsample) # stack samples into one data table
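The same loop could also be written with lapply(), which avoids the manual list index. This is only a sketch under some assumptions: the toy data1 and data2 below are made-up stand-ins for the real tables, the column names month1 and month2 are taken from the extract in the question, and n_draws, months and the helper sample_month are hypothetical names introduced for illustration. It also skips any month for which one of the filtered tables is empty, which may or may not be what you want.

```r
library(data.table)

set.seed(1) # make the random draws reproducible

# Toy stand-ins for the tables in the question (made-up values)
data1 <- data.table(month1 = rep(1:12, each = 3),
                    year   = rep(2014:2016, times = 12))
data2 <- data.table(month2 = rep(1:12, each = 2))

n_draws <- 10   # rows drawn per month (assumption)
months  <- 1:11 # month 12 has no later month in data2

# Hypothetical helper: draw n_draws rows for month i from both tables
sample_month <- function(i) {
  sub1 <- data1[month1 == i] # rows of data1 for month i
  sub2 <- data2[month2 > i]  # rows of data2 for later months
  # Skip months where either filter returns no rows
  if (nrow(sub1) == 0L || nrow(sub2) == 0L) return(NULL)
  s1 <- sub1[sample.int(nrow(sub1), n_draws, replace = TRUE)]
  s2 <- sub2[sample.int(nrow(sub2), n_draws, replace = TRUE)]
  cbind(s1, s2) # same number of rows, so the columns line up
}

# rbindlist() drops NULL entries and stacks the rest
allsamples <- rbindlist(lapply(months, sample_month))
```

Only allsamples is kept at the end; the per-month samples exist only inside the function call. sample.int() is used instead of sample() so that a subset with a single row is still indexed by row number.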