Tags: python, python-3.x, pandas, plot, correlation

How to make a correlation plot with a certain lag of two time series


I am trying to plot the lagged correlation between two time series in order to find the lag I need. The Python module statsmodels.graphics.tsaplots offers plot_acf for investigating the lagged impact of a time series on itself.

How can I plot the lagged correlation between one time series and another, so that I can see which lag I should choose?


Solution

  • https://stackoverflow.com/users/7094244/michael-grogan, thank you for the explanation of "autocorrelation" and "cross-correlation". I would suggest making the plot look more "statistical", for example like this one I made:

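    import matplotlib.pyplot as plt

    # TS1 and TS2 are the two time series (equal-length 1-D arrays)
    plt.xcorr(TS1, TS2, usevlines=True, maxlags=20, normed=True, lw=2)
    plt.grid(True)
    plt.axhline(0.2, color='blue', linestyle='dashed', lw=2)  # reference line at 0.2
    plt.ylim([0, 0.3])
    plt.title("Cross-correlation")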
    

    Cross-correlation plot image

    As you can see from the plot, mine is a rather special case with almost no correlation. Ideally, you should replace

    plt.ylim([0, 0.3])

    with

    plt.ylim([0, 1])

    to see the full range of positive correlations (or use [-1, 1] to include negative ones). As a rule of thumb, a correlation of 0.2 or more is often treated as significant, although the actual significance threshold depends on the length of the series.
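
To answer the "which lag should I choose" part numerically rather than by eye, one option is to compute the correlation at each lag and pick the lag with the largest absolute value. The following is only a minimal sketch in plain Python, not what plt.xcorr does internally: the helper names (pearson, lagged_correlations), the synthetic ts1/ts2 data, and the sign convention (a positive lag means ts2 follows ts1) are all assumptions made for illustration. The 1.96 / sqrt(N) value is the usual approximate 95% band for uncorrelated noise, which may be a better reference line than a fixed 0.2.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def lagged_correlations(x, y, max_lag):
    """Correlate x[t] with y[t + lag] for every lag in -max_lag..max_lag."""
    n = len(x)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:n - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:n + lag]
        result[lag] = pearson(a, b)
    return result


# Hypothetical data: ts2 is ts1 delayed by 3 steps, plus a little noise.
random.seed(0)
ts1 = [random.gauss(0, 1) for _ in range(300)]
true_delay = 3
ts2 = [0.0] * true_delay + ts1[:-true_delay]
ts2 = [v + random.gauss(0, 0.1) for v in ts2]

corrs = lagged_correlations(ts1, ts2, max_lag=20)

# Lag with the strongest (positive or negative) correlation.
best_lag = max(corrs, key=lambda k: abs(corrs[k]))

# Approximate 95% band for the correlation of two unrelated noise series.
bound = 1.96 / math.sqrt(len(ts1))

print("best lag:", best_lag)
print("correlation at best lag:", round(corrs[best_lag], 3))
print("approximate 95% bound:", round(bound, 3))
```

With real data, ts1 and ts2 would be replaced by the two series (for example, pandas columns converted with .tolist()), and only lags whose correlation lies outside the band would be worth considering.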