I am testing some code from online tutorials, and I have problems reproducing the results for 'plot_acf' and 'plot_pacf' from 'statsmodels'.
Take this example: using exactly the same code, I obtain this.
Another example: using the same code, I obtain this.
The plots always stop at a maximum of 20 lags. Is that the default value of a parameter?
Neither tutorial specifies any parameter other than the time-series values: plot_acf(series)
When I try to specify a number of lags, it works up to a certain value. If I increase lags beyond that value, I get the error:
"Can only compute partial correlations for lags up to 50% of the sample size."
Can anyone explain how I can reproduce the same results?
I am using statsmodels version 0.12.2.
The code is simple:
from pandas import read_csv
from statsmodels.graphics.tsaplots import plot_acf
from statsmodels.graphics.tsaplots import plot_pacf
from matplotlib import pyplot
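# squeeze=True was removed from read_csv in pandas 2.0; DataFrame.squeeze('columns') does the same job
series = read_csv('stationary.csv', header=None, index_col=0, parse_dates=True).squeeze('columns')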
print(series)
pyplot.figure()
pyplot.subplot(211)
plot_acf(series, ax=pyplot.gca())
pyplot.subplot(212)
plot_pacf(series, ax=pyplot.gca())
pyplot.show()
I ran into a similar problem, and this is what the documentation says about the "lags" parameter:
If not provided, lags=np.arange(len(corr)) is used.
I have no idea what "corr" refers to, as I can't find it on the doc page (it may refer to the correlations on the vertical Y axis?).
In my case, with 36xx rows of data, the default gives me about 1% of the rows, i.e. 36 lags. When I tried a sample of 2000 rows, I got the same default of 36.
After reading this official GitHub thread: https://github.com/statsmodels/statsmodels/issues/4663 it looks like the author made a sensible change to the default in a newer version, and that change leads to your scenario.
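As far as I can tell from that thread, the newer default is roughly 10 * log10(n) lags, rounded up and capped at n - 1, instead of every available lag. The sketch below just reproduces that arithmetic. The formula and the helper name `default_lag_count` are my own reading of the change, not something quoted from the statsmodels docs, so check it against the version you have installed.

```python
import math


def default_lag_count(nobs):
    """Assumed default number of lags: ceil(10 * log10(nobs)), capped at nobs - 1.

    This mirrors my understanding of the change discussed in the GitHub
    thread; it is not copied from the statsmodels source.
    """
    return min(math.ceil(10 * math.log10(nobs)), nobs - 1)


print(default_lag_count(100))   # 20
print(default_lag_count(3600))  # 36
```

A series of 100 observations would give 20, which is the maximum seen in the question, and 3600 rows give the 36 I see in my own data.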
In the examples you provided, both of your own outputs indicate that the truncated lags (those beyond the rightmost data point) are statistically insignificant (well inside the shaded boundary), so you should not worry about getting an exact replica.
I also noticed that the input parameters and defaults used by statsmodels change slightly over time.
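If you still want the plots to extend as far as the tutorial's, you can pass "lags" explicitly. Here is a minimal sketch, assuming the limit implied by the error message (the number of lags must stay strictly below half the sample size). The helper names `clamp_lags` and `plot_correlograms` and the default of 40 requested lags are mine, purely for illustration.

```python
def clamp_lags(nobs, requested):
    """Keep the requested lag count strictly below 50% of the sample size.

    Assumption: the largest lag plot_pacf accepts is nobs // 2 - 1, based on
    the wording of the error message in the question.
    """
    return min(requested, nobs // 2 - 1)


def plot_correlograms(series, requested_lags=40):
    """Plot the ACF and PACF of `series` with an explicit, clamped lag count."""
    # Imported here so clamp_lags can be used without the plotting stack.
    import matplotlib.pyplot as plt
    from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

    lags = clamp_lags(len(series), requested_lags)
    fig, (ax_acf, ax_pacf) = plt.subplots(2, 1)
    plot_acf(series, lags=lags, ax=ax_acf)
    plot_pacf(series, lags=lags, ax=ax_pacf)
    return fig
```

With the series from the question you would call plot_correlograms(series) followed by pyplot.show(). Only the PACF raises that error, but I use the same lag count for both plots so that the two x axes line up.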