
R strucchange bootstrap test statistic due to nonspherical disturbances


I am trying to find a structural break in the mean of a time series that is skewed, fat-tailed, and heteroskedastic. I apply the Andrews (1993) supF test via the strucchange package. My understanding is that this is valid even with my nonspherical disturbances, but I would like to confirm this via bootstrapping.

I would like to compute the maximum t-statistic from a difference-in-means test at each possible breakpoint (just like the Andrews F-statistic) and then bootstrap the critical value. In other words, I want to find the max t-stat in the time-ordered data, then scramble the data and find the max t-stat in the scrambled data, 10,000 times. I would then compare the max t-stat from the time-ordered data to the critical value given by the max t-stat of rank 9,500 from the scrambled data.

Below I generate example data and apply the Andrews supF test. Is there any way to "correct" the Andrews test for nonspherical disturbances? Is there any way to do the bootstrap I am trying to do?

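library(strucchange)
set.seed(1)  # fix the seed so the simulated example is reproducible
# three log-normal regimes of 120 monthly observations each
Thames <- ts(matrix(c(rlnorm(120, 0, 1), rlnorm(120, 2, 2), rlnorm(120, 4, 1)), ncol = 1), frequency = 12, start = c(1985, 1))
# F statistics for a single shift in the mean at each candidate breakpoint
fs.thames <- Fstats(Thames ~ 1)
sctest(fs.thames)  # supF test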

Solution

(1) Skewness and heavy tails. As usual in linear regression models, the asymptotic justification for the inference does not depend on normality. It also holds for any other error distribution with zero expectation, homoscedasticity, and lack of correlation (the usual Gauss-Markov assumptions). However, if you have a well-fitting skewed distribution for your data of interest, then you might be able to increase efficiency by basing your inference on the corresponding model. For example, the glogis package provides some functions for structural change testing and dating based on a generalized logistic distribution that allows for heavy tails and skewness. Windberger & Zeileis (2014, Eastern European Economics, 52, 66–88, doi:10.2753/EEE0012-8775520304) used this to track changes in the skewness of inflation dynamics over time. (See ?breakpoints.glogisfit for a worked example.) Furthermore, if the skewness itself is not really of interest, then a log or sqrt transformation might also be good enough to make the data more "normal".
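
As a minimal sketch of the transformation idea in (1), the supF test can be rerun on the log scale. The data are simulated in the same way as in the question (so the log is exactly normal by construction, which real data will not be), and the names y and fs_log are only illustrative.

```r
library(strucchange)

set.seed(1)
# same three log-normal regimes as in the question
y <- ts(c(rlnorm(120, 0, 1), rlnorm(120, 2, 2), rlnorm(120, 4, 1)),
        frequency = 12, start = c(1985, 1))

# supF test for a shift in the mean of the log-transformed series
fs_log <- Fstats(log(y) ~ 1)
sctest(fs_log)

# observation at which the F statistic is maximal
breakpoints(fs_log)
```

Note that this tests for a shift in the mean of log(y), which is a different hypothesis from a shift in the mean of y itself.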

(2) Heteroscedasticity and autocorrelation. As usual in linear regression models, the usual estimates of the standard errors (or more broadly the covariance matrix) are not consistent in the presence of heteroscedasticity and/or autocorrelation. One can either try to include this explicitly in the model (e.g., an AR model) or treat it as a nuisance term and employ heteroscedasticity and autocorrelation consistent (HAC) covariance matrices (e.g., Newey-West or Andrews' quadratic spectral kernel HAC). The function Fstats() in strucchange allows you to plug in such estimators, e.g., from the sandwich package. See ?durab for an example using vcovHC().
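
A sketch of how (2) might look for the example data, assuming the vcov. argument of Fstats() accepts a function that returns the coefficient covariance matrix of a fitted lm object. The wrapper hc_cov and the choice of type = "HC1" are illustrative assumptions, not a recommendation. This guards against heteroscedasticity only; an HAC estimator would be needed if autocorrelation is also a concern.

```r
library(strucchange)
library(sandwich)

set.seed(1)
# same simulated data as in the question
Thames <- ts(c(rlnorm(120, 0, 1), rlnorm(120, 2, 2), rlnorm(120, 4, 1)),
             frequency = 12, start = c(1985, 1))

# hypothetical wrapper: heteroscedasticity-consistent covariance for an lm fit
hc_cov <- function(x, ...) vcovHC(x, type = "HC1")

# F statistics computed with the robust covariance instead of the default
fs_hc <- Fstats(Thames ~ 1, vcov. = hc_cov)
sctest(fs_hc)
```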

(3) Bootstrap and permutation p-values. The "scrambling" of the time series you describe above sounds more like applying permutations (i.e., sampling without replacement) rather than the bootstrap (i.e., sampling with replacement). The former is feasible if the errors are uncorrelated or exchangeable. If you are regressing just on a constant, then you can employ the function maxstat_test() from the coin package to carry out the supF test. The test statistic is computed in a somewhat different way; however, it can be shown to be equivalent to the supF test in the constant-only case (see Zeileis & Hothorn, 2013, Statistical Papers, 54, 931–954, doi:10.1007/s00362-013-0503-4). If you want to perform the permutation test in a more general model, then you would have to do the permutations "by hand" and simply store the test statistic from each permutation. Alternatively, the bootstrap can be applied, e.g., via the boot package (where you would still need to write your own small function that computes the test statistic from a given bootstrap sample). There are also some R packages (e.g., tseries) that implement bootstrap schemes for dependent series.
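
The "by hand" permutation scheme from (3) can be sketched in base R for the constant-only case. The helper max_abs_t() is hypothetical (it is not part of strucchange or coin). It computes the largest absolute pooled-variance two-sample t-statistic over all candidate breakpoints, with 15% trimming at each end assumed to mirror the default of Fstats(). The permutations are only valid under the exchangeability assumption discussed above.

```r
# hypothetical helper: max |t| for a difference in means over all breakpoints
max_abs_t <- function(y, trim = 0.15) {
  n  <- length(y)
  h  <- max(2, floor(trim * n))      # minimal segment size
  k  <- seq(h, n - h)                # sizes of the first segment
  cs <- cumsum(y)
  cs2 <- cumsum(y^2)
  n1 <- k
  n2 <- n - k
  m1 <- cs[k] / n1                   # mean before the break
  m2 <- (cs[n] - cs[k]) / n2         # mean after the break
  ss1 <- cs2[k] - n1 * m1^2          # within-segment sums of squares
  ss2 <- (cs2[n] - cs2[k]) - n2 * m2^2
  s2  <- (ss1 + ss2) / (n - 2)       # pooled variance
  max(abs(m1 - m2) / sqrt(s2 * (1 / n1 + 1 / n2)))
}

set.seed(123)
# same data-generating process as in the question
y <- c(rlnorm(120, 0, 1), rlnorm(120, 2, 2), rlnorm(120, 4, 1))

obs_stat <- max_abs_t(y)             # statistic in the time-ordered data

B <- 9999                            # number of permutations
perm_stats <- replicate(B, max_abs_t(sample(y)))

crit_95 <- quantile(perm_stats, 0.95)              # permutation critical value
p_value <- (1 + sum(perm_stats >= obs_stat)) / (B + 1)

c(statistic = obs_stat, critical = unname(crit_95), p.value = p_value)
```

With the pooled variance, the squared t-statistic at each breakpoint equals the corresponding F-statistic in the constant-only model, so this permutes the same quantity as the supF test. Replacing sample(y) with sample(y, replace = TRUE) would turn the scheme into a simple (i.i.d.) bootstrap instead.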