python pandas numpy dataframe covariance

Diagonal element for covariance matrix not 1 pandas/numpy


I have the following dataframe:

   A  B
0  1  5
1  2  6
2  3  7
3  4  8

I wish to calculate the covariance of the two columns:

a = df.iloc[:, 0].values
b = df.iloc[:, 1].values

Using numpy to compute the covariance:

numpy.cov(a,b)

I get:

array([[ 1.66666667,  1.66666667],
       [ 1.66666667,  1.66666667]])

Shouldn't the diagonal elements be 1? How do I get the diagonal elements to 1?


Solution

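  • No, they shouldn't. You may be confusing covariance with correlation; the two are different quantities.

    The diagonal of a covariance matrix holds the variance of each variable: cov(X, X) = var(X). Here both A and B have a sample variance of 5/3 ≈ 1.6667, which is exactly what numpy.cov returns. Correlation divides the covariance by the product of the standard deviations, corr(X, Y) = cov(X, Y) / (σ_X σ_Y), so only the correlation matrix has ones on its diagonal.

    The formulas are given on the Wikipedia pages for covariance and correlation.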
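To make the difference concrete, here is a minimal sketch using the values from the question. It assumes the two columns are rebuilt as plain numpy arrays, and the names `cov`, `corr`, `std` and `corr_from_cov` are only illustrative. It compares `numpy.cov` with `numpy.corrcoef` and shows that rescaling the covariance matrix by the standard deviations yields the correlation matrix:

```python
import numpy as np

# Same values as columns A and B in the question
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

# Covariance matrix: the diagonal holds the sample variances (ddof=1 by default)
cov = np.cov(a, b)

# Correlation matrix: the diagonal is 1 by construction
corr = np.corrcoef(a, b)

# Dividing each covariance by the product of the two standard deviations
# reproduces the correlation matrix
std = np.sqrt(np.diag(cov))
corr_from_cov = cov / np.outer(std, std)

print(cov)
print(corr)
print(corr_from_cov)
```

If the data stays in a DataFrame, `df.cov()` and `df.corr()` give the corresponding matrices directly, so `df.corr()` is probably what you want when you expect ones on the diagonal.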