
Insufficient data to compute STL decomposition using tsfeatures in R


I'm trying to calculate some time series features, and when I use the tsfeatures function in R I get the following warnings:

Warning messages:
1: In .f(.x[[i]], ...) : Insufficient data to compute STL decomposition
2: In .f(.x[[i]], ...) : Insufficient data to compute STL decomposition

Each time series has 365 days of measurement values, so I'm confused by the warning about insufficient data.

I suspect the problem may be related to how I coerced the original panel data into a time series matrix, as shown below, but I may have made a mistake:

library(tsfeatures)
library(dplyr)  # provides n_distinct() and bind_cols() used below

set.seed(25)

inds <- seq(as.Date("2022-05-01"), as.Date("2023-04-30"), by = "day")

values <- c(rnorm(length(inds), mean = 20, sd = 2),rnorm(length(inds), mean = 200, sd = 20))

products <- rep(c("A","B"), each = 365)

df <- data.frame(c(inds,inds),products,values)

df_mts <- ts(matrix(df$values, ncol = n_distinct(df$products), nrow = 365), frequency = 365, start = c(2022, as.numeric(format(as.Date("2022-05-01"), "%j"))))

features <- bind_cols(
  tsfeatures(df_mts, c("acf_features","entropy","lumpiness","flat_spots","crossing_points")),
  tsfeatures(df_mts,"stl_features", s.window = "periodic", robust = TRUE)
)

Is this the correct approach to convert daily data to ts format?


Solution

  • You have one year of daily data and have specified the seasonal period as 365, so STL can't estimate the seasonal component: it needs at least two full seasonal periods, which here means two full years of data.

    If you want to use weekly seasonality instead of annual seasonality, just set frequency to 7. Then your code will work.
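
    The point above can be checked with base R alone. The following is a minimal sketch that calls stats::stl() directly rather than going through tsfeatures, on a simulated series like product A in the question; the variable names (x_annual, x_weekly, annual_fit, weekly_fit) are made up for illustration.

```r
set.seed(25)
x <- rnorm(365, mean = 20, sd = 2)  # one year of simulated daily values

# Annual period: 365 observations cover only one period, so stl() stops
# with an error. tryCatch() captures the message instead of aborting.
x_annual <- ts(x, frequency = 365)
annual_fit <- tryCatch(
  stl(x_annual, s.window = "periodic", robust = TRUE),
  error = function(e) conditionMessage(e)
)

# Weekly period: the same 365 observations cover about 52 periods,
# so the decomposition can be estimated.
x_weekly <- ts(x, frequency = 7)
weekly_fit <- stl(x_weekly, s.window = "periodic", robust = TRUE)

is.character(annual_fit)     # TRUE: the annual decomposition failed
inherits(weekly_fit, "stl")  # TRUE: the weekly decomposition succeeded
```

    In the question's code, the corresponding change is to pass frequency = 7 instead of frequency = 365 in the ts() call.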