I am trying to experiment with HoltWinters using some random data. However, using the statsmodel api I am unable to prediction for the next X data points.
Here is my sample code. I am unable to understand the predict API and what it means by start
and end
.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.holtwinters import ExponentialSmoothing
data = np.linspace(start=15, stop=25, num=100)
noise = np.random.uniform(0, 1, 100)
data = data + noise
split = int(len(data)*0.7)
data_train = data[0:split]
data_test = data[-(len(data) - split):]
model = ExponentialSmoothing(data_train)
model_fit = model.fit()
# make prediction
pred = model_fit.predict(split+1, len(data))
test_index = [i for i in range(split, len(data))]
plt.plot(data_train, label='Train')
plt.plot(test_index, data_test, label='Test')
plt.plot(test_index, pred, label='Prediction')
plt.legend(loc='best')
plt.show()
I get a weird graph for prediction and I believe it has something to do with my understanding of the predict
API.
The exponential smoothing model you've chosen doesn't include a trend, so it is forecasting the best level, and that gives a horizontal line forecast.
If you do:
model = ExponentialSmoothing(data_train, trend='add')
then you will get a trend, and likely it will look more like you expect.
For example:
# Simulate some data
np.random.seed(12346)
dta = pd.Series(np.arange(100) + np.sin(np.arange(100)) * 5 + np.random.normal(scale=4, size=100))
# Perform exponention smoothing, no trend
mod1 = sm.tsa.ExponentialSmoothing(dta)
res1 = mod1.fit()
fcast1 = res1.forecast(30)
plt.plot(dta)
plt.plot(fcast1, label='Model without trend')
# Perform exponention smoothing, with a trend
mod2 = sm.tsa.ExponentialSmoothing(dta, trend='add')
res2 = mod2.fit()
fcast2 = res2.forecast(30)
plt.plot(fcast2, label='Model with trend')
plt.legend(loc='lower right')
gives the following: