Tags: python, numpy, linear-algebra, eigenvector

NumPy eigh() gives incorrect eigenvectors


I have been using NumPy to do some linear algebra, but am having issues with eigh() seemingly returning incorrect eigenvectors. I am working with a 481 by 481 symmetric matrix (loaded from C.npy below). Consider the following code:

import numpy as np

c = np.load('C.npy')
eigenvectors_h = np.linalg.eigh(c)[1]
c.dot(eigenvectors_h[:, 0]) / eigenvectors_h[:, 0]

c is real and is definitely symmetric because np.allclose(c, c.T) returns True.

Rather than an array consisting of the corresponding eigenvalue repeated 481 times, I see what look like random numbers:

-1.03913672e+04   1.70145524e-15  -6.30342977e-16  -5.80181781e-15
2.43586988e-15   1.02142854e-13  -1.59100453e-13   6.34768223e-14
1.47793139e-14   5.19832084e-15   1.23786159e-14  -6.89332604e-14
3.35120474e-14  -5.55497774e-14   3.35912194e-14   2.41599711e-14
-8.10853045e-15   3.39728566e-14   1.66605768e-14  -4.62536283e-14
4.78158644e-15   1.05840004e-14   2.49968277e-14   6.37161752e-14
-8.29049452e-15   7.69540224e-13  -1.41737099e-14  -2.04904903e-14
-2.02649833e-14   8.75614182e-15  -6.43718154e-14  -6.71884701e-15
2.89567850e-15  -1.93639427e-14   2.05961830e-15  -2.82825321e-14
-2.99156760e-14  -1.17149803e-14  -1.00413883e-13   2.81365540e-15
-1.47420105e-14  -1.54638301e-14   2.97770540e-14   9.42616109e-14
-2.18819275e-13   6.23156105e-13  -1.14148906e-14   1.97147438e-14
-5.04103330e-14  -1.39415849e-13  -6.78032159e-14  -2.18085326e-14
-1.36511305e-14   3.09529929e-14  -9.42922227e-15  -1.80013713e-14
-3.34989389e-14  -1.31399506e-15   3.86915434e-14   7.43430828e-15
3.00681063e-14   3.15599077e-15   1.77696288e-14  -5.33198194e-15
-3.03304561e-14   1.74254787e-13  -3.31735946e-14   3.02816694e-14
-1.79965325e-14  -6.03794643e-13  -2.70964350e-14   1.13476801e-14
-1.31756179e-14   1.23490868e-14   1.31665400e-14  -4.82723158e-15
-6.80123679e-14   1.01616264e-13  -3.26409876e-14   5.00897081e-15
8.01025834e-14  -5.50605097e-13   3.03059609e-15   5.55524078e-14
1.76830600e-14   5.82590991e-14  -4.07471685e-14  -1.74936332e-14
3.67006376e-14   1.23210295e-14   4.54025070e-14  -1.37452957e-13
5.68932273e-15   3.59057575e-14   7.52501521e-15  -8.21274824e-15
2.86056181e-13   2.12632482e-14   9.50056403e-14   2.80131245e-14
...

However, it works fine with eig():

eigenvectors = np.linalg.eig(c)[1]
c.dot(eigenvectors[:, 0]) / eigenvectors[:, 0] #[126.56622721] * 481

I would like to use eigh() because it seems faster and returns real values rather than complex ones, but it doesn't seem to work here. Can anyone explain why this happens, or what I am doing wrong?

Self-contained example

The following example demonstrates the same behaviour with self-contained code, using a random matrix that is essentially of rank 1 up to numerical artifacts.

import numpy as np
u = np.random.uniform(size=(100,))
c = np.outer(u, u)  # symmetric and, in exact arithmetic, rank 1

eigenvectors_h = np.linalg.eigh(c)[1]
print(c.dot(eigenvectors_h[:, 0]) / eigenvectors_h[:, 0])  # "random" numbers, as above

eigenvectors = np.linalg.eig(c)[1]
print(c.dot(eigenvectors[:, 0]) / eigenvectors[:, 0])  # the eigenvalue, repeated
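
As a quick sanity check, c = np.outer(u, u) is rank 1 in exact arithmetic and its single nonzero eigenvalue is u.dot(u), so both properties are easy to confirm:

print(np.linalg.matrix_rank(c))  # should print 1 (rank up to floating-point tolerance)
print(u.dot(u))                  # the one nonzero eigenvalue of c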

Solution

  • There is nothing wrong with eigh, any more than with eig. You will get similarly "random" numbers if you test eig with an eigenvector from the end of the list: try

    print(c.dot(eigenvectors[:, -10]) / eigenvectors[:, -10])
    

    Here is what is going on:

    1. Most of the eigenvalues of your matrix are indistinguishable from 0; they are of size about 1e-16 or less.
    2. The method eigh returns the smallest eigenvalues (and their eigenvectors) first.
    3. The method eig makes no promises about ordering, but here it returns the largest eigenvalue (and its eigenvector) first.
    4. You are testing only the first eigenvector returned.

    So you see a difference in the behavior of eig and eigh that isn't really there.
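
    To see this concretely with the self-contained rank-1 example, repeat the original test with the eigenvector that eigh pairs with its largest eigenvalue, i.e. the last column instead of the first (a small sketch reusing c from that example):

    eigenvalues_h, eigenvectors_h = np.linalg.eigh(c)
    # eigh sorts eigenvalues in ascending order, so the dominant eigenpair sits in the last column
    print(c.dot(eigenvectors_h[:, -1]) / eigenvectors_h[:, -1])

    This should print the dominant eigenvalue repeated, mirroring the eig test from the question.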

    In practical terms, your matrix appears to be low rank plus numerical noise. The smallish eigenvalues should probably all be 0, in which case the computation of their eigenvectors is basically a lottery: there is a huge linear subspace that is very nearly killed by the matrix, and anything from that subspace can pass as an eigenvector. The result of the multiplication c * eigenvector is then just numerical noise.
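
    A more robust way to judge an eigenpair than the elementwise quotient (which blows up wherever an entry of the eigenvector is near zero) is the norm of the residual c.dot(v) - eigenvalue * v. A sketch, again reusing c from the rank-1 example:

    eigenvalues_h, eigenvectors_h = np.linalg.eigh(c)
    # column i of the difference below is C v_i - lambda_i v_i
    residuals = np.linalg.norm(c.dot(eigenvectors_h) - eigenvectors_h * eigenvalues_h, axis=0)
    print(residuals.max())  # tiny: on the order of machine epsilon times the norm of c

    By this measure, every eigenpair returned by eigh, including the "zero" ones, is perfectly acceptable.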

    Moral: Don't forget to look at the whole picture.

    eigenvalues_h, eigenvectors_h = np.linalg.eigh(c)
    print(eigenvalues_h)  # ascending order: a long run of near-zero values, dominant eigenvalue last

    eigenvalues, eigenvectors = np.linalg.eig(c)
    print(eigenvalues)    # no guaranteed order; here the dominant eigenvalue comes first
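
    And the two methods agree on the spectrum itself once it is sorted; only the ordering of the columns differs. A final sketch, continuing from the snippet above:

    # .real drops the negligible imaginary parts that eig may attach to an (almost) symmetric matrix
    print(np.allclose(np.sort(eigenvalues.real), eigenvalues_h))  # should print True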