Tags: r, time-series, xts

Decompose ts with odd starting month


I am currently trying to extract the seasonal component of my data. To do so, I create a ts via tk_ts from a tibble of dates and values. Unfortunately, one of my data sets starts at 2011-07-01 and runs to 2018-05-01 (with gaps that I have already filled using pad from the padr package).

As far as I understand, a ts with frequency = 12 has to start on the first of January, so I can't model this data with a ts directly. I therefore tried to create an xts from my data and cast it to a ts, but either I can't get the frequency to work or the data comes out wrong.

Here is my MWE:

library(tibble)     # tibble()
library(tidyquant)  # attaches xts, among others
library(timetk)     # tk_ts()

raw_data <- tibble(Date = c(as.Date("2011-07-01"), as.Date("2011-08-01"),
                   as.Date("2011-09-01"), as.Date("2011-10-01"),
                   as.Date("2011-11-01"), as.Date("2011-12-01"),
                   as.Date("2012-01-01"), as.Date("2012-02-01")),
                  Value = c(1,4,1,4,1,4,1,4))
                  # And so on, till 2018-05-01 and with reasonable values

tk_ts(raw_data, select = Value, start = 2011, frequency = 12)
# Leads to:
# 
#      Jan Feb Mar Apr May Jun Jul Aug
# 2011   1   4   1   4   1   4   1   4
#
# which is bad since my first date is 2011-07-01 not 2011-01-01.

xts_data <- xts(raw_data$Value, order.by = raw_data$Date, frequency = 12)
# xts_data prints as follows, which is fine:
# 
#            [,1]
# 2011-07-01    1
# 2011-08-01    4
# 2011-09-01    1
# 2011-10-01    4
# 2011-11-01    1
# 2011-12-01    4
# 2012-01-01    1
# 2012-02-01    4

as.ts(xts_data, start = start(xts_data), end = end(xts_data))
# Leads to:
# 
# Time Series:
# Start = 15156 
# End = 15371 
# Frequency = 1 
# [1] 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1
# [52] 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4
# [103] 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1
# [154] 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4
# [205] 1 4 1 4 1 4 1 4 1 4 1 4
#
# Which is totally wrong, since there are far more than the original 8 values.

as.ts(xts_data, start = start(xts_data))
# Leads to:
# 
# Time Series:
#   Start = 15156 
# End = 15163 
# Frequency = 1 
# [1] 1 4 1 4 1 4 1 4
#
# Which is bad since the frequency is 1 instead of 12,
#  and decompose needs the correct frequency.

as.ts(xts_data, start = start(xts_data), end = end(xts_data), frequency = 12)
# Leads to:
# 
# Error in ts(coredata(x), frequency = frequency(x), ...) : 
#   formal argument "frequency" matched by multiple actual arguments

attr(xts_data, 'frequency') <- 12
as.ts(xts_data, start = start(xts_data))
# Leads to:
# 
# Jan Feb Mar Apr May Jun Jul Aug
# 15156   1   4   1   4   1   4   1   4
#
# Which is as bad as the first example

So how can I decompose data (to get the seasonal component) that does not start on the first of January?


Solution

  • A ts with frequency = 12 does not have to start in January. The start argument also accepts a vector c(year, period), so you can pass the month number along with the year (7 for July in this case).

    raw_data <- tibble(Date = c(as.Date("2011-07-01"), as.Date("2011-08-01"),
                                as.Date("2011-09-01"), as.Date("2011-10-01"),
                                as.Date("2011-11-01"), as.Date("2011-12-01"),
                                as.Date("2012-01-01"), as.Date("2012-02-01")),
                       Value = c(1,4,1,4,1,4,1,4))
    # And so on, till 2018-05-01 and with reasonable values
    
    tk_ts(raw_data, select = Value, start = c(2011,07), frequency = 12)
    

    This results in the following output:

    tk_ts(raw_data, select = Value, start = c(2011,07), frequency = 12)
         Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
    2011                           1   4   1   4   1   4
    2012   1   4 
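
If you would rather not hard-code the starting month, here is a minimal sketch that derives it from the first date. It uses base R's ts(), which accepts the same start = c(year, period) form. The names dates, values, first_date, ts_start and monthly_ts are placeholders chosen for this illustration, and the eight values are the sample data from the question.

```r
# Illustrative data only: the same eight monthly values as in the question.
dates  <- seq(as.Date("2011-07-01"), by = "month", length.out = 8)
values <- c(1, 4, 1, 4, 1, 4, 1, 4)

# Derive c(year, month) from the earliest date instead of typing it by hand.
first_date <- min(dates)
ts_start <- c(as.integer(format(first_date, "%Y")),
              as.integer(format(first_date, "%m")))

# start = c(2011, 7) with frequency = 12 means "7th period of 2011", i.e. July.
monthly_ts <- ts(values, start = ts_start, frequency = 12)
print(monthly_ts)
```

This assumes the dates are regularly spaced by month with no gaps, which should hold once the series has been padded.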
    

    Hope that helps with what you are trying to achieve in the subsequent steps.
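
For the decompose step itself, a rough sketch could look like the one below. The values are made up (an alternating 1/4 pattern over the full 2011-07 to 2018-05 range), and dates, values, monthly_ts and parts are names invented for this example. Note that decompose() needs at least two full periods, so the eight-value sample alone would not be enough.

```r
# Hypothetical monthly series covering 2011-07 to 2018-05;
# the values are invented purely for illustration.
dates  <- seq(as.Date("2011-07-01"), as.Date("2018-05-01"), by = "month")
values <- rep(c(1, 4), length.out = length(dates))

# Monthly ts that starts in July 2011 rather than January.
monthly_ts <- ts(values, start = c(2011, 7), frequency = 12)

# Classical decomposition; requires at least 2 * frequency observations.
parts <- decompose(monthly_ts)

# The seasonal component is itself a ts aligned to the original months.
print(window(parts$seasonal, start = c(2011, 7), end = c(2012, 6)))
```

The components parts$trend and parts$random are available in the same way; the trend is NA at both ends of the series because of the moving-average filter.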