I want to find the covariance of a 10304*280 matrix (i.e. 280 variables, each with 10304 subjects), and I am using the following numpy function to do so.
cov = numpy.cov(matrix)
I am expecting a 280*280 matrix as a result, but it returned a 10304*10304 matrix.
As suggested in the previous answer, you can change your memory layout. An easy way to do this in 2d is simply transposing the matrix:
import numpy as np
r = np.random.rand(100, 10)
np.cov(r).shape # is (100,100)
np.cov(r.T).shape # is (10,10)
But you can also specify the rowvar flag, which is described in the numpy.cov documentation:
import numpy as np
r = np.random.rand(100, 10)
np.cov(r).shape # is (100,100)
np.cov(r, rowvar=False).shape # is (10,10)
I think especially for large matrices this might be more performant, since you avoid the swapping/transposing of axes.
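Either way, both calls should give the same matrix. Here is a minimal sketch to check that; the seed and the variable names are only illustrative:

```python
import numpy as np

# Illustrative data: 100 observations (rows) of 10 variables (columns).
rng = np.random.default_rng(0)
r = rng.random((100, 10))

cov_transposed = np.cov(r.T)           # transpose so that variables are rows
cov_rowvar = np.cov(r, rowvar=False)   # tell numpy that variables are columns

same_shape = cov_transposed.shape == cov_rowvar.shape == (10, 10)
same_values = np.allclose(cov_transposed, cov_rowvar)
print(same_shape, same_values)
```

For the matrix in the question, either form would return a 280*280 result.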
UPDATE:
I thought about this and wondered whether the algorithm is actually different depending on rowvar == True or rowvar == False. Well, as it turns out, if you change the rowvar flag, numpy simply transposes the array itself :P.
You can see this in the source code of numpy.cov.
So, in terms of performance, nothing will change between the two versions.
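It may also help to know that transposing a 2D array is cheap in the first place: .T returns a view on the same data rather than a copy. A small sketch to illustrate this (variable names are just for the example):

```python
import numpy as np

r = np.random.rand(100, 10)
rt = r.T

# The transpose is a view onto the same buffer, not a copy.
is_view = np.shares_memory(r, rt)

# Only the shape and strides are swapped; no data is moved.
strides_swapped = rt.strides == r.strides[::-1]

print(rt.shape, is_view, strides_swapped)
```

So whichever way you write it, the transpose step itself should not be where the time goes.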