Tags: r, correlation, dot-product

Why does the correlation calculated by the 'cor' function in R differ from the cosine of the angle between vectors?


I'm trying to demonstrate that the cosine of the angle between vectors is the same as the correlation coefficient between them. However, the cor function in R and the value I get from the dot product definition of the cosine of the angle are different.

data("anscombe")


x <- matrix(anscombe$x1)
y <- matrix(anscombe$y1)

cor(x,y)

(t(x)%*%y) / ((norm(x, type = 'f') * norm(y, type = 'f')))

If you can point out any mistake in my code, or share any thoughts on this issue, it would be greatly appreciated.


Solution

  • You have to center the vectors before taking the cosine of the angle between them. The Pearson correlation is the cosine similarity between mean-centered vectors: cor subtracts each vector's mean before comparing them, whereas your dot-product formula uses the raw values. If you center the vectors first and then compute the cosine similarity, it produces the same answer as cor.

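    data("anscombe")

    x <- matrix(anscombe$x1)
    y <- matrix(anscombe$y1)

    # Center each vector by subtracting its mean
    x_centered <- x - mean(x)
    y_centered <- y - mean(y)

    cor(x, y)

    # Cosine of the angle between the centered vectors; matches cor(x, y)
    (t(x_centered) %*% y_centered) / (norm(x_centered, type = 'f') * norm(y_centered, type = 'f'))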
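
To check the equivalence numerically rather than by eye, a minimal sketch along these lines may help. The `cosine` helper is a hypothetical name introduced here for illustration (it is not part of base R), and the sketch works on plain numeric vectors instead of one-column matrices, which is an assumption that simplifies the arithmetic without changing the result.

```r
# Hypothetical helper (not a base R function): cosine of the angle
# between two numeric vectors, computed from the dot product.
cosine <- function(u, v) {
  sum(u * v) / (sqrt(sum(u^2)) * sqrt(sum(v^2)))
}

data("anscombe")
x <- anscombe$x1
y <- anscombe$y1

raw_cosine      <- cosine(x, y)                      # uncentered vectors
centered_cosine <- cosine(x - mean(x), y - mean(y))  # mean-centered vectors
pearson         <- cor(x, y)

# all.equal() compares within floating-point tolerance
isTRUE(all.equal(centered_cosine, pearson))  # centered cosine agrees with cor()
isTRUE(all.equal(raw_cosine, pearson))       # raw cosine does not
```

Comparing with `all.equal` instead of `==` avoids false mismatches from floating-point rounding, since the two computations take different arithmetic paths to the same value.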