I am using seasonal_decompose from the statsmodels library (link below) in Python to decompose a time series.
https://www.statsmodels.org/dev/generated/statsmodels.tsa.seasonal.seasonal_decompose.html
Below is the code I am executing to get plots for seasonal_decompose. I am also sharing a screenshot below to depict the plots that I get. Please note that all three components show up in the plots: Trend, Seasonal and Residual.
Here is a sample of the dataset. The dataset has 144 rows in total, running through 12/31/2014.
Month+Year | Sales | Month | Year |
---|---|---|---|
1/31/2003 | 141 | 1 | 2003 |
2/28/2003 | 157 | 2 | 2003 |
3/31/2003 | 185 | 3 | 2003 |
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

s_d_multi = seasonal_decompose(df['Sales'], model='multiplicative')
s_d_multi.plot()
plt.show()
If I use .seasonal, I get the values shown below:
print(s_d_multi.seasonal)
Month+Year
2003-01-31 0.823333
2003-02-28 0.843859
2003-03-31 1.012370
2003-04-30 1.080556
2003-05-31 1.185793
...
2014-08-31 1.230135
2014-09-30 0.961964
2014-10-31 0.836410
2014-11-30 0.765533
2014-12-31 0.903109
Name: seasonal, Length: 144, dtype: float64
If I use .resid, I get NaN values as shown below:
s_d_multi.resid
Month+Year
2003-01-31 NaN
2003-02-28 NaN
2003-03-31 NaN
2003-04-30 NaN
2003-05-31 NaN
..
2014-08-31 NaN
2014-09-30 NaN
2014-10-31 NaN
2014-11-30 NaN
2014-12-31 NaN
Name: resid, Length: 144, dtype: float64
Similarly, if I try using .trend, I get NaN values.
s_d_multi.trend
Month+Year
2003-01-31 NaN
2003-02-28 NaN
2003-03-31 NaN
2003-04-30 NaN
2003-05-31 NaN
..
2014-08-31 NaN
2014-09-30 NaN
2014-10-31 NaN
2014-11-30 NaN
2014-12-31 NaN
Name: trend, Length: 144, dtype: float64
I should get values for .resid as well as for .trend, like I am getting values for .seasonal.
The NaN values at the beginning and end of trend and residual are expected behavior for seasonal_decompose in statsmodels, which implements naive decomposition (also known as classical decomposition), as noted in the linked statsmodels documentation.
Naive decomposition estimates the trend with a centered moving average spanning one seasonal period, so it cannot provide a trend estimate for the first and last few observations; how many are lost depends on the seasonal period. Without a trend estimate, the residual cannot be calculated.
A better explanation can be found in Forecasting: Principles and Practice by Rob J Hyndman and George Athanasopoulos. In particular, take a look at the "Comments on classical decomposition" section which states:
The estimate of the trend-cycle is unavailable for the first few and last few observations. For example, if m=12, there is no trend-cycle estimate for the first six or the last six observations. Consequently, there is also no estimate of the remainder component for the same time periods.
You may want to take a look at a more advanced decomposition model such as Seasonal-Trend decomposition using LOESS (STL), which provides trend estimates for the first and last observations as well.