python, numpy, matrix-multiplication, covariance

Can covariance of A be used to calculate A'*A?


I am benchmarking different ways to calculate A'*A in Python, with A being an N x M matrix. One of the fastest ways was to use numpy.dot().

I was curious whether I could obtain the same result using numpy.cov() (which gives the covariance matrix) by somehow varying the weights or pre-processing the A matrix, but I had no success. Does anyone know if there is any relation between the product A'*A and the covariance of A, where A is a matrix with N rows/observations and M columns/variables?


Solution

  • Have a look at the cov source. Near the end of the function it does this:

    c = dot(X, X_T.conj())
    

    This is basically the dot product you want to perform. However, cov also does all kinds of other work around it: checking inputs, subtracting the mean, normalizing, and so on.

    In short, np.cov cannot be faster than np.dot(A.T, A), because it performs that same product internally on top of the extra steps.

    That said - the covariance matrix is computed as

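    $$C = \frac{1}{N - 1}\,(A - \bar{A})^{T}(A - \bar{A})$$

    where $\bar{A}$ is the row vector of column means of A, subtracted from each of the N rows.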

    Or in Python:

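    import numpy as np
    
    a = np.random.rand(10, 3)  # 10 observations (rows), 3 variables (columns)
    
    m = np.mean(a, axis=0, keepdims=True)  # per-variable mean, shape (1, 3)
    x = np.dot((a - m).T, a - m) / (a.shape[0] - 1)  # centered A'*A over N - 1
    
    y = np.cov(a.T)  # np.cov expects variables in rows by default
    
    assert np.allclose(x, y)  # check they are equivalent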
    

    As you can see, the covariance matrix is equivalent to the raw dot product if you first subtract the mean of each variable from the data and then divide the product by the number of samples minus one.
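
Going the other way, the same relation can be rearranged to recover A'*A from the covariance matrix: expanding the centered product gives A'*A = (N - 1) * cov(A) + N * m'*m, where m is the row vector of column means. The following is only a minimal sketch of that identity, assuming the default np.cov normalization (division by N - 1) and no weights; the names gram_from_cov and gram_direct are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((10, 3))  # N = 10 observations, M = 3 variables
n = a.shape[0]

m = a.mean(axis=0, keepdims=True)  # column means, shape (1, M)
c = np.cov(a, rowvar=False)        # covariance of the columns, shape (M, M)

# Undo the normalization, then add back the mean term removed by centering.
gram_from_cov = (n - 1) * c + n * (m.T @ m)

gram_direct = a.T @ a              # the product the question asks about

assert np.allclose(gram_from_cov, gram_direct)
```

This shows that the relation exists, but it is not a speed-up: np.cov still performs the M x M product internally, and the correction adds further work on top of it.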