forecast with R


I have daily dengue index data from January 2010 to July 2015:

date    dengue_index
1/1/2010    0.169194109
1/2/2010    0.172350434
1/3/2010    0.174939783
1/4/2010    0.176244642
1/5/2010    0.176658068
1/6/2010    0.177815751
1/7/2010    0.17893075
1/8/2010    0.1813232
1/9/2010    0.182199531
1/10/2010   0.185091158
1/11/2010   0.185267748
1/12/2010   0.185894524
1/13/2010   0.18511499
1/14/2010   0.188080728
1/15/2010   0.190019472
…   …
7/20/2015   0.112748885
7/21/2015   0.113246022
7/22/2015   0.111755091
7/23/2015   0.112164176
7/24/2015   0.11429011
7/25/2015   0.113951836
7/26/2015   0.11319131
7/27/2015   0.112918734

I want to predict the values until the end of 2016 using R.

library(forecast)
setwd("...")
dengue_series <- read.csv(file = "r_wikipedia-googletrends-model.csv", header = TRUE, sep = ";")
# frequency = 7: treat the daily series as having a weekly cycle
dengue_index <- ts(dengue_series$dengue_index, frequency = 7)
plot(dengue_index)
# lambda = 0 -> Box-Cox log transform, so back-transformed forecasts stay positive
fit <- auto.arima(dengue_index, lambda = 0)
fit
# predict until December 2016 (500 days ahead)
forecast_series <- forecast(fit, h = 500)
forecast_series
plot(forecast_series)

Problem: the prediction is not good!

How can I improve the prediction?

Link to the data source: https://www.dropbox.com/s/wnvc4e78t124fkd/r_wikipedia-googletrends-model.csv?dl=0


Solution

  • You can try specifying the data as a multi-seasonal time series object (msts) and then forecasting with tbats. tbats is referenced in the paper that David Arenburg mentions in the comments.

    Here's an example using the taylor dataset from the forecast package, which has two seasonal periods: 48 half-hour periods in a day and 336 half-hour periods in a week (i.e. 336 / 48 = 7 days).

    library(forecast)
    # taylor: half-hourly data with a daily (48) and a weekly (336) period
    x <- msts(taylor, seasonal.periods=c(48,336), ts.frequency=48, start=2000+22/52)
    fit <- tbats(x)
    fc <- forecast(fit)

    # not shown, but the forecast seems to capture both seasonal patterns
    plot(fc)
    

    Also see http://users.ox.ac.uk/~mast0315/CompareUnivLoad.pdf for additional information on the taylor data.

    For your data set, with daily data and more than one seasonal pattern (a weekly cycle plus a longer one of 84 days or a year), perhaps

    tsdat <- msts(dat, seasonal.periods=c(7, 84), ts.frequency=7, start=2010)
    

    Or

    tsdat <- msts(dat, seasonal.periods=c(7, 365.25), ts.frequency=7, start=2010)
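
    The seasonal periods passed to msts are counted in observations per cycle, which is why the half-hourly taylor data uses 48 and 336 while daily data would use 7 and 365.25. A minimal sketch of that arithmetic in base R; the variable names (obs_per_day, taylor_periods, dengue_periods) are illustrative only and not part of the forecast package:

    ```r
    # Seasonal periods are expressed as the number of observations per cycle.
    obs_per_day <- 48                                  # taylor: half-hourly readings
    taylor_periods <- obs_per_day * c(day = 1, week = 7)

    obs_per_day <- 1                                   # dengue index: one value per day
    dengue_periods <- obs_per_day * c(week = 7, year = 365.25)

    print(taylor_periods)
    print(dengue_periods)
    ```

    Either vector could then be supplied as the seasonal.periods argument of msts.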
    

    EDIT

    Using the provided data, weekly and yearly seasonal periods (7 and 365 days) appear to give a decent forecast.

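    library(forecast)
    # The file is semicolon-separated with a header row
    dengue_series <- read.table("r_wikipedia-googletrends-model.csv", header=TRUE, sep=";")
    # Two seasonal periods, counted in days: weekly (7) and yearly (365)
    dengue_index <- msts(dengue_series$dengue_index, seasonal.periods=c(7, 365), ts.frequency=7)
    fit <- tbats(dengue_index)

    # No h is given here, so forecast() uses its default horizon
    fc <- forecast(fit)
    plot(fc)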
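
    The forecast(fit) call above relies on the default horizon. Since the question asks for values until the end of 2016, the horizon can be derived from the dates instead. This is a minimal sketch in base R, assuming the series is daily without gaps and ends on 7/27/2015 as in the sample shown in the question; last_obs, target and fc_dates are hypothetical names:

    ```r
    # Assumption: the last observation is 7/27/2015, as in the question's sample.
    last_obs <- as.Date("7/27/2015", format = "%m/%d/%Y")
    target   <- as.Date("2016-12-31")

    # Number of daily steps needed to reach the target date
    h <- as.integer(target - last_obs)

    # Calendar dates corresponding to each forecast step
    fc_dates <- seq(last_obs + 1, by = "day", length.out = h)

    print(h)
    print(range(fc_dates))
    ```

    With the fitted model, this value would be passed as forecast(fit, h = h), and fc_dates can be used to label the forecast values by date.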
    
