
Time Series Subsetting by Integers


I've seen a lot of posts about subsetting time series by specific dates, but I cannot figure out how to subset by integer position. Consider:

# create dummy data
data <- ts(seq_len(96), start = c(2009, 1), frequency = 12)
# create training data
training.set <- ts(data[1:(length(data)-8)], start=c(2009,1), frequency=12)

# I want to remove the last 8 values (or any integer) and use that as a test set while retaining the correct dates
test.set <- ts(data[(length(data)-8+1):length(data)])
test.set # start/end aren't retained for the test set

Time Series:
Start = 1 
End = 8 
Frequency = 1 
[1] 89 90 91 92 93 94 95 96

I know I can specify the new start/end dates on the test set explicitly, but that won't work for my use case. I'm trying to find a way to do this automatically, so that a function I'm writing can take any input time series and split it into training and test sets with the correct dates (for any integer < length of the input series).


Solution

  • Note that arbitrary subsetting is not possible because the "ts" class can only represent regularly spaced series; you can, however, subset a contiguous interval.

    1) base. Here is a base R solution that looks up the time of the first of the last 8 observations and passes it to window as start=. (Had the interval not run to the end of the series, we would have needed end= as well.)

    window(data, start = tail(time(data), 8)[1])
    

    giving:

         May Jun Jul Aug Sep Oct Nov Dec
    2016  89  90  91  92  93  94  95  96
    

    2) zoo. We can subset by position directly if we first convert to zoo. After conversion, take the subset and convert back (or omit the as.ts and keep working with the zoo object):

    library(zoo)
    
    as.ts(tail(as.zoo(data), 8))
    

    2a) Here is a variation of (2):

    as.ts(as.zoo(data)[seq(to = length(data), length.out = 8)])
    

    (2) and (2a) give the same answer as (1).
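
Since the question asks for a function that handles any input series and any integer, approach (1) could be wrapped up roughly as follows. This is only a sketch: ts_slice and split_ts are hypothetical helper names, not part of base R or zoo, and it assumes the positions passed in are valid for the series.

```r
# Hypothetical helper: keep the observations at integer positions from..to
# of a "ts" object, retaining its dates by going through window().
ts_slice <- function(x, from, to) {
  stopifnot(is.ts(x), from >= 1, to <= NROW(x), from <= to)
  tt <- time(x)
  # start= and end= are both needed because the interval may be interior
  window(x, start = tt[from], end = tt[to])
}

# Hypothetical helper: hold out the last n_test observations as a test set.
split_ts <- function(x, n_test) {
  n <- NROW(x)
  stopifnot(n_test >= 1, n_test < n)
  list(
    training = ts_slice(x, 1, n - n_test),
    test     = ts_slice(x, n - n_test + 1, n)
  )
}

data  <- ts(seq_len(96), start = c(2009, 1), frequency = 12)
parts <- split_ts(data, 8)

start(parts$test)        # 2016 5
end(parts$training)      # 2016 4
ts_slice(data, 13, 24)   # the twelve months of 2010
```

Both pieces come back as "ts" objects with the original frequency, so they can be passed on to modelling functions without setting start/end by hand.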