
tseries - block bootstrap two series same order of resampling


For example:

    require(tseries)
    series1 <- c(100,140,150,200,150,260,267,280,300,350)
    series2 <- c(500,600,250,300,350,500,100,130,50,60)
    data <- data.frame("series1" = series1, "series2" = series2)
    ts <- tsbootstrap(data$series1, m=1, b=2, type="block", nb=10)
    ts <- as.data.frame(ts)
    head(ts)

    > head(ts)
       V1  V2  V3  V4  V5  V6  V7  V8  V9 V10
    1 280 280 150 200 100 300 150 140 100 260
    2 300 300 200 150 140 350 260 150 140 267
    3 140 260 140 260 267 200 150 150 260 300
    4 150 267 150 267 280 150 200 200 267 350
    5 260 100 260 150 300 100 150 267 100 200
    6 267 140 267 200 350 140 260 280 140 150

We now have blocks of two, stitched together in a different order. My question is: how can I 'reshuffle' series1 and series2 by block bootstrap whilst keeping the blocks of both series in the same order?

For example, with a block size of 2, suppose the bootstrap picks the block at positions 5 and 6 (out of 10) in series1 and moves it to positions 1 and 2. For series2 it should then also take elements 5 and 6 and move them to positions 1 and 2. That way the two series stay in the same order. Is this possible?
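The idea described above can be sketched in base R by resampling row positions once and applying the same positions to both series. This is only a minimal sketch: `block_indices` is a hypothetical helper (not part of tseries), the seed is arbitrary, and the scheme shown is a simple moving-block draw that may differ from what `tsbootstrap` does internally.

```r
# Hypothetical helper: one moving-block resample of the positions 1..n.
# Assumes 1 <= b <= n.
block_indices <- function(n, b) {
  n.blocks <- ceiling(n / b)                    # blocks needed to cover n rows
  starts <- sample.int(n - b + 1, n.blocks, replace = TRUE)
  idx <- unlist(lapply(starts, function(s) s:(s + b - 1)))
  idx[seq_len(n)]                               # trim to the original length
}

set.seed(1)  # arbitrary seed, only for reproducibility
series1 <- c(100, 140, 150, 200, 150, 260, 267, 280, 300, 350)
series2 <- c(500, 600, 250, 300, 350, 500, 100, 130, 50, 60)
data <- data.frame(series1 = series1, series2 = series2)

idx <- block_indices(nrow(data), b = 2)
resampled <- data[idx, ]   # both columns are reordered by the same positions
```

Because the same `idx` is used for every column, each resampled row is a row of the original data, so the pairing between series1 and series2 is preserved.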

So far I have tried merging series1 and series2 into a new column, so that when I bootstrap it, both values move to the same position:

    data <- transform(data, ts.merge=paste(series1, series2, sep=","))
    head(data)
      series1 series2 ts.merge
    1     100     500  100,500
    2     140     600  140,600
    3     150     250  150,250
    4     200     300  200,300
    5     150     350  150,350
    6     260     500  260,500

However, the "," separator is not compatible with tseries; the merged values can no longer be coerced to numbers:

    Error in FUN(newX[, i], ...) : 
      NA/NaN/Inf in foreign function call (arg 1)
    In addition: Warning messages:
    1: In as.vector(x, mode = "double") : NAs introduced by coercion
    2: In as.vector(x, mode = "double") : NAs introduced by coercion
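The coercion warning can be reproduced in isolation. This small sketch (the variable names are illustrative) shows that a comma-joined string turns into `NA` when converted to a number, which is presumably what triggers the error above:

```r
# A merged value like the ones in the ts.merge column
merged <- paste(100, 500, sep = ",")   # "100,500"

# Converting it to a number yields NA (with a coercion warning,
# suppressed here to keep the output quiet)
value <- suppressWarnings(as.numeric(merged))
is.na(value)
```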

I also tried the separator "", but then I am not sure how to tell the two numeric values apart in order to split them again afterwards. (Note that my real data are not simply three-digit values like those shown above; otherwise I could just split each string in half.)


Solution

  • Took me all day, but this is a manual solution which resamples whole rows in blocks, so both series keep the same order:

        # Random data: 1000 rows, 2 columns (the two series)
        data <- as.data.frame(matrix(rnorm(20 * 100), ncol = 2))
        # Block length (rows per block); NROW(data) must be a multiple of it
        block.len <- 5
        reps <- NROW(data) / block.len             # number of blocks
        data$id <- rep(1:reps, each = block.len)   # block id for every row
        IDs <- unique(data$id)
        # Bootstrap the data frame once:
        # split the rows into blocks by id, resample the blocks with
        # replacement, and stack them again. Whole rows move together,
        # so both series are reshuffled in the same order.
        # The argument x is unused; it only lets the loop pass a run index.
        bootSTRAP <- function(x) {
          temp <- vector("list", length(IDs))
          for (i in seq_along(IDs)) {
            temp[[i]] <- data[data$id == IDs[i], ]
          }
          out <- sample(temp, replace = TRUE)
          do.call(rbind, out)
        }
    
        # Run the bootstrap 1000 times
        runs <- 1:1000
        run.output <- list()
        for (i in seq_along(runs)) {
          tryCatch({
            ptm0 <- proc.time()                    # time the bootstrap itself
            run.output[[i]] <- bootSTRAP(runs[i])
            time <- as.numeric((proc.time() - ptm0)[3])
            cat('\n', 'Iteration', i, 'took', time, "seconds to complete")
          }, error = function(e) { print(paste("i =", i, "failed:", conditionMessage(e))) })
        }

        # cbind the outputs: 3 columns (V1, V2, id) per run
        master <- do.call(cbind, run.output)
        # Rename the columns so that each one carries its run number
        col.ids <- rep(seq_along(runs), each = 3)
        cnames <- paste(col.ids)
        colnames(master) <- cnames
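The same block-wise resampling can be written more compactly by resampling row positions instead of sub-data-frames. This is a hedged sketch rather than a drop-in replacement: `boot_once`, `block.len`, the seed, and the choice of 100 runs are illustrative assumptions, and like the solution above it uses fixed, non-overlapping blocks.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
data <- as.data.frame(matrix(rnorm(20 * 100), ncol = 2))

block.len <- 5                                   # rows per block
id <- rep(seq_len(nrow(data) / block.len), each = block.len)
blocks <- split(seq_len(nrow(data)), id)         # row positions of each block

# Hypothetical helper: draw blocks with replacement and return the
# corresponding rows, so both columns are reordered together
boot_once <- function() {
  picked <- sample(blocks, replace = TRUE)
  data[unlist(picked, use.names = FALSE), ]
}

# A list of bootstrap replicates, one data frame per run
boots <- replicate(100, boot_once(), simplify = FALSE)
```

Keeping the replicates in a list avoids the repeated column names that arise when all runs are bound side by side, and any single run can be reached as `boots[[k]]`.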