python, time-series, statsmodels, timeserieschart

Seasonal decompose returns NaN values for Residual and Trend components


I am using seasonal_decompose from the statsmodels library (linked below) in Python to decompose a time series.

https://www.statsmodels.org/dev/generated/statsmodels.tsa.seasonal.seasonal_decompose.html

Below is the code I am running to plot the output of seasonal_decompose, along with a screenshot of the resulting plots. Note that all three components show up in the plots: trend, seasonal, and residual.

Here is a sample of the dataset. The full dataset has 144 rows, ending at 12/31/2014.

Month+Year Sales Month Year
1/31/2003 141 1 2003
2/28/2003 157 2 2003
3/31/2003 185 3 2003

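import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose
# df is indexed by the Month+Year dates shown above
s_d_multi = seasonal_decompose(df['Sales'], model='multiplicative')
s_d_multi.plot()
plt.show()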

[Figure: plot of the seasonal_decompose results]

If I use .seasonal, I get the values shown below:

print(s_d_multi.seasonal)
Month+Year
2003-01-31    0.823333
2003-02-28    0.843859
2003-03-31    1.012370
2003-04-30    1.080556
2003-05-31    1.185793
                ...   
2014-08-31    1.230135
2014-09-30    0.961964
2014-10-31    0.836410
2014-11-30    0.765533
2014-12-31    0.903109
Name: seasonal, Length: 144, dtype: float64

If I use .resid, I get NaN values, as shown below:

s_d_multi.resid
Month+Year
2003-01-31   NaN
2003-02-28   NaN
2003-03-31   NaN
2003-04-30   NaN
2003-05-31   NaN
              ..
2014-08-31   NaN
2014-09-30   NaN
2014-10-31   NaN
2014-11-30   NaN
2014-12-31   NaN
Name: resid, Length: 144, dtype: float64

Similarly, if I use .trend, I get NaN values:

s_d_multi.trend
Month+Year
2003-01-31   NaN
2003-02-28   NaN
2003-03-31   NaN
2003-04-30   NaN
2003-05-31   NaN
              ..
2014-08-31   NaN
2014-09-30   NaN
2014-10-31   NaN
2014-11-30   NaN
2014-12-31   NaN
Name: trend, Length: 144, dtype: float64

I expected to get values for .resid and .trend, just as I do for .seasonal.


Solution

  • The NaN values at the beginning and end of the trend and residual components are expected behavior for seasonal_decompose in statsmodels, which implements naive decomposition (also known as classical decomposition), as noted in the linked statsmodels documentation.

    Naive decomposition estimates the trend with a centered moving average, so it cannot provide a trend estimate for the first and last few observations; how many are affected depends on the seasonal period of the time series. Without a trend estimate, the residual cannot be calculated.

    A better explanation can be found in Forecasting: Principles and Practice by Rob J Hyndman and George Athanasopoulos. In particular, take a look at the "Comments on classical decomposition" section which states:

    The estimate of the trend-cycle is unavailable for the first few and last few observations. For example, if m=12, there is no trend-cycle estimate for the first six or the last six observations. Consequently, there is also no estimate of the remainder component for the same time periods.

    You may want to take a look at a more advanced decomposition method such as Seasonal-Trend decomposition using LOESS (STL), which provides trend estimates for the first and last observations as well.