
Efficient matrix square root of large symmetric positive semidefinite matrix in Python


I have a large (150,000 x 150,000) symmetric positive semidefinite sample covariance matrix whose matrix square root I wish to compute efficiently in Python.

Is there any way to speed up the square root computation given that the matrix is symmetric psd? scipy.linalg.sqrtm is slow for my purpose.


Solution

  • Depending on your application, if it is enough to find a factor with Bs @ Bs.T ~ B, you can use Cholesky decomposition. If not, you can compute the symmetric square root from an eigenvalue decomposition.

    import numpy as np
    import scipy.linalg
    A = np.random.randn(1500, 1500)
    B = A @ A.T  # symmetric positive semidefinite test matrix
    
    
    %%time
    Bs = scipy.linalg.sqrtm(B)
    

    Wall time: 4.4 s - Our baseline

    %%time
    Bs = scipy.linalg.cholesky(B)
    

    Wall time: 52 ms. Cholesky is much faster.

    %%time
    D, V = scipy.linalg.eigh(B)
    Bs = (V * np.sqrt(D)) @ V.T
    

    Wall time: 1.62 s, more than twice as fast as sqrtm (eigh exploits symmetry).
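    Note that the two factorizations satisfy different identities: the eigendecomposition gives the symmetric square root, with Bs @ Bs ~ B, while the Cholesky factor L satisfies L @ L.T ~ B. A quick sanity check might look like the sketch below (the clipping of tiny negative eigenvalues is a precaution for semidefinite matrices, not part of the original answer):

    ```python
    import numpy as np
    import scipy.linalg

    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 200))
    B = A @ A.T  # symmetric positive (semi)definite

    # Symmetric square root via eigendecomposition
    D, V = scipy.linalg.eigh(B)
    D = np.clip(D, 0, None)  # guard against tiny negative eigenvalues
    Bs = (V * np.sqrt(D)) @ V.T
    assert np.allclose(Bs @ Bs, B)   # Bs is symmetric, so Bs @ Bs = B

    # Cholesky factor: lower triangular, a different identity
    L = scipy.linalg.cholesky(B, lower=True)
    assert np.allclose(L @ L.T, B)   # L @ L.T = B, but L @ L != B in general
    ```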

    Using PyTorch

    PyTorch supports many linear algebra functions, and it parallelizes them across multiple CPU cores.

    import torch.linalg
    B_cpu = torch.tensor(B, device='cpu')
    

    Square root using eigh (12 logical / 6 physical cores)

    %%time
    D, V = torch.linalg.eigh(B_cpu)
    Bs = (V * torch.sqrt(D)) @ V.T
    

    Wall time: 400 ms

    Or Cholesky decomposition

    %%time
    Bs = torch.linalg.cholesky(B_cpu)
    

    Wall time: 27 ms
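    One caveat for a genuinely semidefinite sample covariance (e.g. computed from fewer samples than dimensions, which is plausible at 150,000 x 150,000): Cholesky requires strict positive definiteness and can raise LinAlgError on a rank-deficient matrix. A common workaround, sketched below with illustrative names (`X`, `jitter` are assumptions, not from the original answer), is to add a small diagonal "jitter" before factorizing:

    ```python
    import numpy as np
    import scipy.linalg

    rng = np.random.default_rng(1)
    X = rng.standard_normal((50, 100))  # fewer samples than dimensions
    B = X.T @ X / 50                    # sample covariance, rank <= 50 (semidefinite)

    # Plain cholesky(B) may raise LinAlgError here; a tiny diagonal shift
    # makes the matrix strictly positive definite at negligible cost to accuracy.
    jitter = 1e-10 * np.trace(B) / B.shape[0]
    L = scipy.linalg.cholesky(B + jitter * np.eye(B.shape[0]), lower=True)
    assert np.allclose(L @ L.T, B, atol=1e-6)
    ```

    The eigendecomposition route has no such restriction, since zero eigenvalues simply map to zero under the square root.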