python numpy scipy linear-algebra numerical-methods

Parallel exact matrix diagonalization with Python

Is anyone aware of an implemented version (perhaps using scipy/numpy) of parallel exact matrix diagonalization (equivalently, finding the eigensystem)? If it helps, my matrices are symmetric and sparse. I would hate to spend a day reinventing the wheel.

EDIT:

My matrices are at least 10,000x10,000 (but, preferably, at least 20 times larger). For now, I only have access to a 4-core Intel machine (with hyperthreading, so 2 processes per core), ~3.0Ghz each with 12GB of RAM. I may later have access to a 128-core node ~3.6Ghz/core with 256GB of RAM, so single machine/multiple cores should do it (for my other parallel tasks, I have been using multiprocessing). I would prefer for the algorithms to scale well.

I do need exact diagonalization, so scipy.sparse routines are not be good for me (tried, didn't work well). I have been using numpy.linalg.eigh (I see only single core doing all the computations).

Alternatively (to the original question): is there an online resource where I can find out more about compiling SciPy so as to insure parallel execution?

Solution

For symmetric sparse matrix eigenvalue/eigenvector finding, you may use scipy.sparse.linalg.eigsh. It uses ARPACK behind the scenes, and there are parallel ARPACK implementations. AFAIK, SciPy can be compiled with one if your scipy installation uses the serial version.

However, this is not a good answer, if you need all eigenvalues and eigenvectors for the matrix, as the sparse version uses the Lanczos algorithm.

If your matrix is not overwhelmingly large, then just use numpy.linalg.eigh. It uses LAPACK or BLAS and may use parallel code internally.

If you end up rolling your own, please note that SciPy/NumPy does all the heavy lifting with different highly optimized linear algebra packages, not in pure Python. Due to this the performance and degree of parallelism depends heavily on the libraries your SciPy/NumPy installation is compiled with.

(Your question does not reveal if you just want to have parallel code running on several processors, or on several computers. Also, the size of your matrix has a big impact on the best method. So, this answer may be completely off-the-mark.)