Tags: r, time-series, difference

How to select time series conditionally


Say I have a time series object, B_ts, containing several series. Some of them may require differencing to make them stationary; others may not. I would like to run an augmented Dickey-Fuller (ADF) test on every series and apply diff(x) ONLY to those for which the test's p-value is > 0.05. Series whose p-value is already < 0.05 should remain untouched.

Is there a way of doing this in R?

So far, I have the following code for a time series object, B_ts:

B_ts <- ts(B)

tseries::adf.test(B_ts)

f1 <- function(x) diff(x)

C <- apply(B_ts, 1, f1)  # but only to rows that require differencing!

tseries::adf.test(C)  # to see whether the p-value for all time series is now < 0.05 after differencing

Many thanks!


Solution

  • Here is one way to do a single pass with lapply. Note that the final p-value for the 2nd series is 0.065, so depending on your problem and your data you may want to difference more than once.

    # Requires the tseries package: install.packages("tseries")
    # Example series from R's built-in datasets (run data() to list others)
    series <- list(t1 = AirPassengers, t2 = BJsales)
    
    # ADF p-value for each series, returned as a named numeric vector
    adf_p <- function(x) tseries::adf.test(x)$p.value
    p_values <- vapply(series, adf_p, numeric(1))
    
    # index only the series that need differencing (unit root not rejected)
    diff_index <- which(p_values > 0.05)
    series_diff <- series
    series_diff[diff_index] <- lapply(series_diff[diff_index], diff)
    
    # verify: re-run the test on the result
    vapply(series_diff, adf_p, numeric(1))
    
    # if a p-value is still > 0.05, iterate again or pick the differencing order another way
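
If one pass is not enough, the "iterate again" idea can be sketched as a small loop. This is only an illustration, not part of the original answer: the helper name diff_until_stationary and its arguments alpha and max_d are made up for this example, and the cap of two differences is an assumption you may want to change.

```r
# Hypothetical helper (name and arguments are assumptions for illustration):
# difference x until the ADF test rejects a unit root at level alpha,
# or until max_d differences have been taken.
diff_until_stationary <- function(x, alpha = 0.05, max_d = 2) {
  d <- 0
  # adf.test warns when the p-value falls outside its table; silence that here
  p <- suppressWarnings(tseries::adf.test(x)$p.value)
  while (p > alpha && d < max_d) {
    x <- diff(x)
    d <- d + 1
    p <- suppressWarnings(tseries::adf.test(x)$p.value)
  }
  list(series = x, d = d, p.value = p)
}

series <- list(t1 = AirPassengers, t2 = BJsales)
result <- lapply(series, diff_until_stationary)

# number of differences applied to each series
vapply(result, function(r) r$d, numeric(1))
```

Each element of result keeps the (possibly differenced) series together with the number of differences taken and the last p-value, so a series that stops at max_d with p.value still above alpha is easy to spot. For a multi-column ts object such as B_ts, the columns could first be split into a list, for example with lapply(seq_len(ncol(B_ts)), function(j) B_ts[, j]), because series differenced a different number of times no longer have equal lengths. The forecast package's ndiffs() function is another possible way to choose the differencing order.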