I have a large (150,000 x 150,000) symmetric positive semidefinite sample covariance matrix whose matrix square root I wish to compute efficiently in Python.
Is there any way to speed up the computation given that the matrix is symmetric PSD? scipy.linalg.sqrtm is too slow for my purposes.
Depending on your application, it may be enough to find a factor Bs with Bs @ Bs.T ≈ B; in that case you can use the Cholesky decomposition. If you need the symmetric square root itself (Bs @ Bs ≈ B), you can get it from the eigenvalue decomposition instead.
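To make that distinction concrete, here is a minimal sketch of my own (variable names L and S are my choices, on a small randomly generated PSD matrix) showing that the Cholesky factor satisfies B = L @ L.T while the symmetric square root satisfies B = S @ S:

import numpy as np
import scipy.linalg

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 500))
B = A @ A.T  # small symmetric PSD test matrix

L = scipy.linalg.cholesky(B, lower=True)  # triangular factor: B == L @ L.T
S = scipy.linalg.sqrtm(B)                 # symmetric square root: B == S @ S

print(np.allclose(L @ L.T, B))  # True
print(np.allclose(S @ S, B))    # True
print(np.allclose(L, L.T))      # False: the Cholesky factor is not symmetric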
import numpy as np
import scipy.linalg
A = np.random.randn(1500, 1500)
B = A @ A.T  # make B symmetric positive semidefinite
%%time
Bs = scipy.linalg.sqrtm(B)
Wall time: 4.4 s
This is our baseline.
%%time
Bs = scipy.linalg.cholesky(B, lower=True)  # lower-triangular factor with Bs @ Bs.T == B
Wall time: 52 ms
Cholesky is much faster, but keep in mind that Bs is a triangular factor, not the symmetric square root.
%%time
D, V = scipy.linalg.eigh(B)
Bs = (V * np.sqrt(D)) @ V.T
Wall time: 1.62 s
More than twice as fast as sqrtm (eigh exploits the symmetry of B).
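One practical caveat (my addition, not part of the timings above): a sample covariance matrix is only guaranteed semidefinite, and round-off can push some computed eigenvalues slightly below zero, in which case np.sqrt(D) produces NaNs. Clipping the eigenvalues at zero first is a common safeguard:

D, V = scipy.linalg.eigh(B)
D = np.clip(D, 0, None)       # guard against tiny negative eigenvalues from round-off
Bs = (V * np.sqrt(D)) @ V.T   # symmetric PSD square root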
PyTorch supports many of the same linear algebra functions, and it parallelizes them across multiple CPU cores.
import torch
B_cpu = torch.tensor(B, device='cpu')
Square root via eigh (12 logical / 6 physical CPU cores):
%%time
D, V = torch.linalg.eigh(B_cpu)
Bs = (V * torch.sqrt(D)) @ V.T
Wall time: 400 ms
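As an aside, if you want to inspect or limit the number of CPU threads PyTorch uses for these calls (the timing above used the defaults), a quick sketch:

import torch

print(torch.get_num_threads())  # intra-op threads PyTorch will use
torch.set_num_threads(6)        # e.g. pin to the physical core count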
Or the Cholesky decomposition:
%%time
Bs = torch.linalg.cholesky(B_cpu)
Wall time: 27 ms
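As a quick sanity check (a sketch of my own, reusing B_cpu from above), both PyTorch results can be verified the same way as the SciPy ones:

D, V = torch.linalg.eigh(B_cpu)
S = (V * torch.sqrt(torch.clamp(D, min=0))) @ V.T  # symmetric square root
L = torch.linalg.cholesky(B_cpu)                   # lower-triangular factor

print(torch.allclose(S @ S, B_cpu, atol=1e-6))    # True: S @ S recovers B
print(torch.allclose(L @ L.T, B_cpu, atol=1e-6))  # True: L @ L.T recovers B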