Here is the algorithm of what I want to do with R:
ARIMA
model through arima.sim()
function2s
, 3s
, 4s
, 5s
, 6s
, 7s
, 8s
, and 9s
.ARIMA
model from the subseries from each block size through auto.arima()
function.RMSE
.The below R
function get that done.
## Load packages and prepare multicore process
library(forecast)
library(future.apply)
plan(multisession)
library(parallel)
library(foreach)
library(doParallel)
n_cores <- detectCores()
cl <- makeCluster(n_cores)
registerDoParallel(cores = detectCores())
## simulate ARIMA(1,0, 0)
#n=10; phi <- 0.6; order <- c(1, 0, 0)
bootstrap1 <- function(n, phi){
ts <- arima.sim(n, model = list(ar=phi, order = c(1, 0, 0)), sd = 1)
########################################################
## create a vector of block sizes
t <- length(ts) # the length of the time series
lb <- seq(n-2)+1 # vector of block sizes to be 1 < l < n (i.e to be between 1 and n exclusively)
########################################################
## This section create matrix to store block means
BOOTSTRAP <- matrix(nrow = 1, ncol = length(lb))
colnames(BOOTSTRAP) <-lb
########################################################
## This section use foreach function to do detail in the brace
BOOTSTRAP <- foreach(b = 1:length(lb), .combine = 'cbind') %do%{
l <- lb[b]# block size at each instance
m <- ceiling(t / l) # number of blocks
blk <- split(ts, rep(1:m, each=l, length.out = t)) # divides the series into blocks
######################################################
res<-sample(blk, replace=T, 10) # resamples the blocks
res.unlist <- unlist(res, use.names = FALSE) # unlist the bootstrap series
train <- head(res.unlist, round(length(res.unlist) - 10)) # Train set
test <- tail(res.unlist, length(res.unlist) - length(train)) # Test set
nfuture <- forecast::forecast(train, model = forecast::auto.arima(train), lambda=0, biasadj=TRUE, h = length(test))$mean # makes the `forecast of test set
RMSE <- Metrics::rmse(test, nfuture) # RETURN RMSE
BOOTSTRAP[b] <- RMSE
}
BOOTSTRAPS <- matrix(BOOTSTRAP, nrow = 1, ncol = length(lb))
colnames(BOOTSTRAPS) <- lb
BOOTSTRAPS
return(list(BOOTSTRAPS))
}
Calling the function
bootstrap1(10, 0.6)
I get the below result:
## 2 3 4 5 6 7 8 9
## [1,] 0.8920703 0.703974 0.6990448 0.714255 1.308236 0.809914 0.5315476 0.8175382
I want to repeat the above step 1
to step 4
chronologically, then I think of Monte Carlo
technology in R
. Thus, I load its package and run the below function:
param_list=list("n"=10, "phi"=0.6)
library(MonteCarlo)
MC_result<-MonteCarlo(func = bootstrap1, nrep=3, param_list = param_list)
expecting to get a like of the below result in matrix
form:
## [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
## [1,] 0.8920703 0.703974 0.6990448 0.714255 1.308236 0.809914 0.5315476 0.8175382
## [2,] 0.8909836 0.8457537 1.095148 0.8918468 0.8913282 0.7894167 0.8911484 0.8694729
## [3,] 1.586785 1.224003 1.375026 1.292847 1.437359 1.418744 1.550254 1.30784
but I get the following error message:
Error in MonteCarlo(func = bootstrap1, nrep = 3, param_list = param_list) : func has to return a list with named components. Each component has to be scalar.
How can I find my way to obtain a desired result like the above and make the result reproducible?
You get this error message because MonteCarlo expects bootstrap1()
to accept one parameter combination for the simulation and that it only returns one value (RMSE
) per replication. This is not the case here since the block length (lb
) is determined by the length of the simulated time series (n
) within bootstrap1
and so you will get results for n - 2
block lengths for each call.
A solution is to pass the block length as a parameter and rewrite bootstrap1()
appropriately:
library(MonteCarlo)
library(forecast)
library(Metrics)
# parameter grids
n <- 10 # length of time series
lb <- seq(n-2) + 1 # vector of block sizes
phi <- 0.6 # autoregressive parameter
reps <- 3 # monte carlo replications
# simulation function
bootstrap1 <- function(n, lb, phi) {
#### simulate ####
ts <- arima.sim(n, model = list(ar = phi, order = c(1, 0, 0)), sd = 1)
#### devide ####
m <- ceiling(n / lb) # number of blocks
blk <- split(ts, rep(1:m, each = lb, length.out = n)) # divide into blocks
#### resample ####
res <- sample(blk, replace = TRUE, 10) # resamples the blocks
res.unlist <- unlist(res, use.names = FALSE) # unlist the bootstrap series
#### train, forecast ####
train <- head(res.unlist, round(length(res.unlist) - 10)) # train set
test <- tail(res.unlist, length(res.unlist) - length(train)) # test set
nfuture <- forecast(train, # forecast
model = auto.arima(train),
lambda = 0, biasadj = TRUE, h = length(test))$mean
### metric ####
RMSE <- rmse(test, nfuture) # return RMSE
return(
list("RMSE" = RMSE)
)
}
param_list = list("n" = n, "lb" = lb, "phi" = phi)
To run the simulation, pass the parameters as well as bootstrap1()
to MonteCarlo()
. For the simulation to be carried out in parallel you need to set the number of cores via ncpus
. The MonteCarlo package uses snowFall, so it should run on Windows.
Note that I also set raw = T
(otherwise the outcomes would be averages over all replications). Setting the seed before will make the results reproducible.
set.seed(123)
MC_result <- MonteCarlo(func = bootstrap1,
nrep = reps,
ncpus = parallel::detectCores() - 1,
param_list = param_list,
export_also = list(
"packages" = c("forecast", "Metrics")
),
raw = T)
The result is an array. I think it's best to transform it into a data.frame via MakeFrame()
:
Frame <- MakeFrame(MC_result)
It's easy to get a reps x lb
matrix though:
matrix(Frame$RMSE, ncol = length(lb), dimnames = list(1:reps, lb))