
R help: bootstrap time series data by row


Suppose I have time series data like this:

set.seed(1234)
x <- matrix( round(rnorm(200, 5)), ncol=10)
colnames(x) <-c('a1','a2','a3','a4','a5','b1','b2','b3','b4','b5')

I'm trying to take every run of 3 adjacent measures of each variable to form a new table. The output would look like this:

*lower case a1-a5 and b1-b5 are the original data points

*upper case A1-A3 and B1-B3 are the new column names, since I only need 3 measures for each variable (variables a and b)

*1-1, 1-2, 1-3 mean that sample #1 is divided into 3 subsets, with replacement

Index A1 A2 A3 B1 B2 B3
1-1 a1 a2 a3 b1 b2 b3
1-2 a2 a3 a4 b2 b3 b4
1-3 a3 a4 a5 b3 b4 b5
2-1 a1 a2 a3 b1 b2 b3
2-2 a2 a3 a4 b2 b3 b4
2-3 a3 a4 a5 b3 b4 b5

This is similar to a bootstrap with replacement, but the problems are that a) it is a time series and b) there are multiple variables.

Any suggestions would be appreciated!


Solution

  • Use some matrix indexing to grab the chunks and reshape back to a matrix:

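    # Column indices for three sliding windows: columns 1:3, 2:4, 3:5 of the
    # a-block, each followed by the matching b-block window (6:8, 7:9, 8:10)
    colidx <- 1:3 + rep(rbind(0:2, 5:7), each=3)
    # Pair every row index with all 18 column indices, extract the values,
    # and lay them out 6 per output row
    matrix(x[cbind( rep(seq_len(nrow(x)), each=length(colidx)), colidx)],
           ncol=6, byrow=TRUE)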
    
     #     [,1] [,2] [,3] [,4] [,5] [,6]
     #[1,]    4    5    6    5    5    5
     #[2,]    5    6    6    5    5    4
     #[3,]    6    6    5    5    4    2
     #[4,]    5    5    4    5    4    6
     #[5,]    5    4    8    4    6    5
     #[6,]    4    8    5    6    5    5
     # etc
    

    Compare this with the first row of x: the first 3 columns are successive windows of the aX variables, and the last 3 columns are successive windows of the bX variables.

     #[1,]  4  5  6  6  5  |  5  5  5  4  2
    
            4  5  6        |  5  5  5 
               5  6  6     |     5  5  4
                  6  6  5  |        5  4  2
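
If the window width or the number of variables changes, the hard-coded offsets above have to be recomputed by hand. The following is only a sketch of one way to generalize the same idea. It is not part of the original answer: the helper name `window_rows` and its `groups` and `width` arguments are made up for illustration, and it assumes every variable occupies a block of columns of the same length. It also adds the `1-1`, `1-2`, ... row labels and `A1`...`B3` column names from the question.

```r
set.seed(1234)
x <- matrix(round(rnorm(200, 5)), ncol = 10)
colnames(x) <- c('a1','a2','a3','a4','a5','b1','b2','b3','b4','b5')

# Hypothetical helper (illustration only): for every row of x, slide a window
# of `width` adjacent columns across each block of columns listed in `groups`.
window_rows <- function(x, groups, width = 3) {
  # Number of windows per block; all blocks are assumed to be equally long
  n_win <- unique(lengths(groups)) - width + 1
  stopifnot(length(n_win) == 1, n_win >= 1)

  # Window w uses positions w, w+1, ..., w+width-1 within each block
  pieces <- lapply(seq_len(n_win), function(w) {
    cols <- unlist(lapply(groups, function(g) g[w + seq_len(width) - 1]),
                   use.names = FALSE)
    x[, cols, drop = FALSE]
  })

  # Stacked order is: all rows for window 1, then all rows for window 2, ...
  out    <- do.call(rbind, pieces)
  row_id <- rep(seq_len(nrow(x)), times = n_win)
  win_id <- rep(seq_len(n_win), each = nrow(x))

  # Reorder so that the windows of each original row sit next to each other
  ord <- order(row_id, win_id)
  out <- out[ord, , drop = FALSE]

  dimnames(out) <- list(
    paste(row_id[ord], win_id[ord], sep = "-"),
    paste0(rep(toupper(names(groups)), each = width), seq_len(width))
  )
  out
}

res <- window_rows(x, groups = list(a = 1:5, b = 6:10), width = 3)
head(res, 6)
```

With `width = 3` and the two 5-column blocks, `res` should hold the same values, in the same order, as the matrix produced by the indexing solution above, only with labelled rows and columns. Building the column indices from `groups` trades a little speed for not having to work out the offset vector manually.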