Suppose I have time series data like this:
set.seed(1234)
x <- matrix(round(rnorm(200, 5)), ncol=10)
colnames(x) <- c('a1','a2','a3','a4','a5','b1','b2','b3','b4','b5')
I'm trying to pick every 3 adjacent measurements of each variable to form a new table. The output would look like the table below, where:
*lower-case a1-a5 and b1-b5 are the original data points
*upper-case A1-A3 and B1-B3 are the new column names, since I only need 3 measures for each variable (variables a and b)
*1-1, 1-2, 1-3 mean that sample #1 is divided into 3 overlapping subsets (i.e. subsets with replacement)
Index | A1 | A2 | A3 | B1 | B2 | B3 |
---|---|---|---|---|---|---|
1-1 | a1 | a2 | a3 | b1 | b2 | b3 |
1-2 | a2 | a3 | a4 | b2 | b3 | b4 |
1-3 | a3 | a4 | a5 | b3 | b4 | b5 |
2-1 | a1 | a2 | a3 | b1 | b2 | b3 |
2-2 | a2 | a3 | a4 | b2 | b3 | b4 |
2-3 | a3 | a4 | a5 | b3 | b4 | b5 |
This is similar to a bootstrap with replacement, but the problems are that a) it is a time series and b) there are multiple variables.
Any suggestions would be appreciated!
Use some matrix indexing to grab the chunks and reshape back to a matrix:
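# column indices per window: a1-a3,b1-b3, then a2-a4,b2-b4, then a3-a5,b3-b5
colidx <- 1:3 + rep(rbind(0:2, 5:7), each=3)
# pair each row of x with all 18 column indices, extract, and lay out 6 per row
matrix(x[cbind(rep(seq_len(nrow(x)), each=length(colidx)), colidx)],
       ncol=6, byrow=TRUE)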
#     [,1] [,2] [,3] [,4] [,5] [,6]
#[1,]    4    5    6    5    5    5
#[2,]    5    6    6    5    5    4
#[3,]    6    6    5    5    4    2
#[4,]    5    5    4    5    4    6
#[5,]    5    4    8    4    6    5
#[6,]    4    8    5    6    5    5
# etc
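To see how the index vector lines up with the table in the question, it may help to reshape `colidx` so that each row lists the source columns of one window. This is only an illustrative sketch; the name `col_layout` is made up here and is not part of the answer's code.

```r
# Same index vector as in the answer; rbind(0:2, 5:7) holds the window
# offsets for the a-block (row 1) and the b-block (row 2)
colidx <- 1:3 + rep(rbind(0:2, 5:7), each = 3)

# Reshape so that each row lists the source columns of one window
col_layout <- matrix(colidx, ncol = 6, byrow = TRUE)
col_layout
#      [,1] [,2] [,3] [,4] [,5] [,6]
# [1,]    1    2    3    6    7    8
# [2,]    2    3    4    7    8    9
# [3,]    3    4    5    8    9   10
```

Each row of this layout is one window (columns 1-5 of `x` are a1-a5, columns 6-10 are b1-b5), and the extracted matrix repeats the three windows for every row of `x`.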
Compare the result to the first row of `x` and you can see that the first 3 columns are the iterations of the `aX` variables, and the last 3 columns are the iterations of the `bX` variables.
#[1,] 4 5 6 6 5 | 5 5 5 4 2
#     4 5 6     | 5 5 5
#       5 6 6   |   5 5 4
#         6 6 5 |     5 4 2
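If row and column labels like those in the question's table are wanted, a slice-and-stack variant might be easier to read. The following is a sketch, assuming each variable occupies a contiguous block of 5 columns; the names `width`, `n_win`, `slices`, `stacked`, `sample_id`, and `out` are illustrative and do not come from the answer above.

```r
set.seed(1234)
x <- matrix(round(rnorm(200, 5)), ncol = 10)
colnames(x) <- c(paste0("a", 1:5), paste0("b", 1:5))

width <- 3                # measures per window
n_win <- 5 - width + 1    # 3 windows per 5-column block (assumed block size)

# One 20 x 6 slice per window shift s: columns a(1+s)..a(3+s), b(1+s)..b(3+s)
slices <- lapply(seq_len(n_win) - 1, function(s) {
  x[, c(seq_len(width) + s, 5 + seq_len(width) + s), drop = FALSE]
})

# Stack the slices, then reorder so each sample's windows sit together;
# order() keeps ties in their original order, so windows stay 1, 2, 3
stacked <- do.call(rbind, slices)
sample_id <- rep(seq_len(nrow(x)), times = n_win)
out <- stacked[order(sample_id), , drop = FALSE]

# Label rows "sample-window" and columns A1..A3, B1..B3 as in the question
dimnames(out) <- list(
  paste(rep(seq_len(nrow(x)), each = n_win), seq_len(n_win), sep = "-"),
  c(paste0("A", seq_len(width)), paste0("B", seq_len(width)))
)
head(out, 3)
```

With labelled rows, a single window can be pulled out by name, for example `out["1-2", ]` for the second window of the first sample.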