At the moment I am trying to extract the seasonal component of my data. To do so I create a ts via tk_ts from a tibble of dates and values.
Unfortunately, one of my data sets starts at 2011-07-01 and runs to 2018-05-01 (with missing data that I already filled with pad from the padr library).
As far as I can tell, a ts with frequency = 12 has to start on the first of January, so I can't model this data with a ts. So I tried to create an xts from my data and cast it to a ts, but either I can't make the frequency work or the data is off.
Here is my MWE:
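library(tibble)     # tibble(); may not be attached by tidyquant in every version
library(tidyquant)  # attaches xts, among others
library(timetk)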
raw_data <- tibble(Date = c(as.Date("2011-07-01"), as.Date("2011-08-01"),
as.Date("2011-09-01"), as.Date("2011-10-01"),
as.Date("2011-11-01"), as.Date("2011-12-01"),
as.Date("2012-01-01"), as.Date("2012-02-01")),
Value = c(1,4,1,4,1,4,1,4))
# And so on, till 2018-05-01 and with reasonable values
tk_ts(raw_data, select = Value, start = 2011, frequency = 12)
# Leads to:
#
#      Jan Feb Mar Apr May Jun Jul Aug
# 2011   1   4   1   4   1   4   1   4
#
# which is bad since my first date is 2011-07-01 not 2011-01-01.
xts_data <- xts(raw_data$Value, order.by = raw_data$Date, frequency = 12)
# xts_data Leads to, which is fine:
#
# [,1]
# 2011-07-01 1
# 2011-08-01 4
# 2011-09-01 1
# 2011-10-01 4
# 2011-11-01 1
# 2011-12-01 4
# 2012-01-01 1
# 2012-02-01 4
as.ts(xts_data, start = start(xts_data), end = end(xts_data))
# Leads to:
#
# Time Series:
# Start = 15156
# End = 15371
# Frequency = 1
# [1] 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1
# [52] 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4
# [103] 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1
# [154] 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4 1 4
# [205] 1 4 1 4 1 4 1 4 1 4 1 4
#
# Which is totally bad since there are more than the original 8 values.
as.ts(xts_data, start = start(xts_data))
# Leads to:
#
# Time Series:
# Start = 15156
# End = 15163
# Frequency = 1
# [1] 1 4 1 4 1 4 1 4
#
# Which is bad since the Frequency is off
# and I need it to be ok for the decompose.
as.ts(xts_data, start = start(xts_data), end = end(xts_data), frequency = 12)
# Leads to:
#
# Error in ts(coredata(x), frequency = frequency(x), ...) :
# formal argument "frequency" matched by multiple actual arguments
attr(xts_data, 'frequency') <- 12
as.ts(xts_data, start = start(xts_data))
# Leads to:
#
#       Jan Feb Mar Apr May Jun Jul Aug
# 15156   1   4   1   4   1   4   1   4
#
# Which is as bad as the first example
So how can I run decompose (to get the seasonal component) on data that does not start on the first of January?
A ts with frequency = 12 does not have to start in January. You can pass the start argument as a two-element vector, c(year, month), so that it also specifies the month number (7 in this case).
raw_data <- tibble(Date = c(as.Date("2011-07-01"), as.Date("2011-08-01"),
as.Date("2011-09-01"), as.Date("2011-10-01"),
as.Date("2011-11-01"), as.Date("2011-12-01"),
as.Date("2012-01-01"), as.Date("2012-02-01")),
Value = c(1,4,1,4,1,4,1,4))
# And so on, till 2018-05-01 and with reasonable values
tk_ts(raw_data, select = Value, start = c(2011,07), frequency = 12)
This results in the following output:
tk_ts(raw_data, select = Value, start = c(2011,07), frequency = 12)
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2011                           1   4   1   4   1   4
2012   1   4
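For the subsequent decompose step, the same idea can be sketched with base R's ts alone. This is only an illustration: the values below are made up (the real series is not shown in the question), and the names dates, values, ts_start and monthly_ts are hypothetical. It derives start = c(year, month) from the first date rather than hard-coding it, and assumes the series is already complete and regularly spaced by month (as it would be after padding).

```r
# Illustrative monthly data from 2011-07-01 to 2018-05-01 (values are made up).
dates  <- seq(as.Date("2011-07-01"), as.Date("2018-05-01"), by = "month")
values <- rep(c(1, 4), length.out = length(dates))

# Derive start = c(year, month) from the first date instead of hard-coding it.
first    <- as.POSIXlt(dates[1])
ts_start <- c(first$year + 1900, first$mon + 1)

# Base R ts: with frequency = 12 the second element of start is the month.
monthly_ts <- ts(values, start = ts_start, frequency = 12)

# decompose() needs at least two full periods (24 months for monthly data).
parts    <- decompose(monthly_ts)
seasonal <- parts$seasonal
```

The seasonal component keeps the time index of the input, so start(seasonal) and end(seasonal) should report July 2011 and May 2018 rather than January-aligned years.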
Hope that helps with what you are trying to achieve in the subsequent steps.