I am trying to investigate the cross correlation of two DataFrames. The code is given here:
import numpy as np
import pandas as pd
df1 = pd.DataFrame({"A": [1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1]})
df2 = pd.DataFrame({"A": [7191, 7275, 9889, 9934, 9633, 9924, 9650, 9341, 8820, 8784, 8869]})
np.correlate(df1, df2)
But I get this error:
https://i.sstatic.net/XedfI.jpg
Any ideas?
You're getting this error because you're passing DataFrames, which are 2D, while np.correlate
computes the cross-correlation of two 1-dimensional sequences. Squeeze each single-column DataFrame down to a Series first:
np.correlate(df1.squeeze(), df2.squeeze())
which outputs array([80556], dtype=int64).
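To make the single value less mysterious, here is a minimal sketch using the same data. It assumes you are fine with selecting the column by name instead of calling squeeze(). With the default mode="valid" and two inputs of equal length, np.correlate only evaluates the zero-lag term, which is just the dot product; mode="full" returns every lag.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"A": [1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1]})
df2 = pd.DataFrame({"A": [7191, 7275, 9889, 9934, 9633, 9924, 9650, 9341, 8820, 8784, 8869]})

# Pull out 1-D arrays; np.correlate does not accept 2-D input.
a = df1["A"].to_numpy()
b = df2["A"].to_numpy()

# Default mode="valid": equal-length inputs give a single zero-lag value.
valid = np.correlate(a, b)
print(valid[0])           # 80556
print(np.dot(a, b))       # 80556, the same number

# mode="full" gives all len(a) + len(b) - 1 lags.
full = np.correlate(a, b, mode="full")
print(len(full))          # 21
print(full[len(b) - 1])   # zero lag sits at index len(b) - 1
```

If you actually want to see how the two series line up at different shifts, mode="full" is probably the one to look at.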
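If what you want is the correlation coefficient (Pearson by default) rather than the cross-correlation, try
# The two columns need different names for the join to work, e.g. A and B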
df1 = pd.DataFrame({"A":[1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1]})
df2 = pd.DataFrame({"B":[7191, 7275, 9889, 9934, 9633, 9924, 9650, 9341, 8820, 8784, 8869]})
df1.join(df2).corr()
which outputs
          A         B
A  1.000000 -0.174287
B -0.174287  1.000000
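As suggested by piRSquared in the comments, you can also use df1.corrwith(df2)
to get just that one coefficient. corrwith pairs columns by label, so use it with the original frames, where both columns are named A; it returns a Series with a single entry.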
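A short sketch of the corrwith route, assuming the original frames where both columns are called A. The Series.corr call at the end is not from the comments; it is just another way to get the same scalar without caring about column names.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1]})
df2 = pd.DataFrame({"A": [7191, 7275, 9889, 9934, 9633, 9924, 9650, 9341, 8820, 8784, 8869]})

# corrwith matches columns by label and returns a Series indexed by column name.
result = df1.corrwith(df2)
r = result["A"]
print(round(r, 6))   # -0.174287

# Alternative: correlate the two Series directly, which ignores column labels.
r_series = df1["A"].corr(df2["A"])
print(round(r_series, 6))
```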