I have been using NumPy to do some linear algebra, but I am having issues with eigh() seemingly returning incorrect eigenvectors. Here is the symmetric matrix (481 by 481) I am working with. Look at the following code:
import numpy as np
c = np.load('C.npy')
eigenvectors_h = np.linalg.eigh(c)[1]
c.dot(eigenvectors_h[:, 0]) / eigenvectors_h[:, 0]
c is real and is definitely symmetric, because np.allclose(c, c.T) returns True.
Rather than get an array consisting of the eigenvalue repeated 481 times, I instead see what look like random numbers:
-1.03913672e+04 1.70145524e-15 -6.30342977e-16 -5.80181781e-15
2.43586988e-15 1.02142854e-13 -1.59100453e-13 6.34768223e-14
1.47793139e-14 5.19832084e-15 1.23786159e-14 -6.89332604e-14
3.35120474e-14 -5.55497774e-14 3.35912194e-14 2.41599711e-14
-8.10853045e-15 3.39728566e-14 1.66605768e-14 -4.62536283e-14
4.78158644e-15 1.05840004e-14 2.49968277e-14 6.37161752e-14
-8.29049452e-15 7.69540224e-13 -1.41737099e-14 -2.04904903e-14
-2.02649833e-14 8.75614182e-15 -6.43718154e-14 -6.71884701e-15
2.89567850e-15 -1.93639427e-14 2.05961830e-15 -2.82825321e-14
-2.99156760e-14 -1.17149803e-14 -1.00413883e-13 2.81365540e-15
-1.47420105e-14 -1.54638301e-14 2.97770540e-14 9.42616109e-14
-2.18819275e-13 6.23156105e-13 -1.14148906e-14 1.97147438e-14
-5.04103330e-14 -1.39415849e-13 -6.78032159e-14 -2.18085326e-14
-1.36511305e-14 3.09529929e-14 -9.42922227e-15 -1.80013713e-14
-3.34989389e-14 -1.31399506e-15 3.86915434e-14 7.43430828e-15
3.00681063e-14 3.15599077e-15 1.77696288e-14 -5.33198194e-15
-3.03304561e-14 1.74254787e-13 -3.31735946e-14 3.02816694e-14
-1.79965325e-14 -6.03794643e-13 -2.70964350e-14 1.13476801e-14
-1.31756179e-14 1.23490868e-14 1.31665400e-14 -4.82723158e-15
-6.80123679e-14 1.01616264e-13 -3.26409876e-14 5.00897081e-15
8.01025834e-14 -5.50605097e-13 3.03059609e-15 5.55524078e-14
1.76830600e-14 5.82590991e-14 -4.07471685e-14 -1.74936332e-14
3.67006376e-14 1.23210295e-14 4.54025070e-14 -1.37452957e-13
5.68932273e-15 3.59057575e-14 7.52501521e-15 -8.21274824e-15
2.86056181e-13 2.12632482e-14 9.50056403e-14 2.80131245e-14
...
However, it works fine with eig():
eigenvectors = np.linalg.eig(c)[1]
c.dot(eigenvectors[:, 0]) / eigenvectors[:, 0] #[126.56622721] * 481
I would like to use eigh() because it seems faster and returns real values rather than complex ones, but it doesn't seem to work. Can anyone explain why this is, or what I am doing wrong?
This example demonstrates the same behaviour in self-contained code (using a random matrix that is essentially rank 1, apart from numerical artifacts).
import numpy as np
u = np.random.uniform(size=(100,))
c = np.outer(u, u)  # symmetric and essentially rank 1
eigenvectors_h = np.linalg.eigh(c)[1]
print(c.dot(eigenvectors_h[:, 0]) / eigenvectors_h[:, 0])
eigenvectors = np.linalg.eig(c)[1]
print(c.dot(eigenvectors[:, 0]) / eigenvectors[:, 0])
There is nothing wrong with eigh, any more than with eig. You will get similarly "random" numbers if you test eig with an eigenvector from the end of the list. Try:
print(c.dot(eigenvectors[:, -10]) / eigenvectors[:, -10])
Here is what is going on:

- eigh returns the smallest eigenvalues (and their eigenvectors) first.
- eig returns its eigenvalues in no guaranteed order; here it happens to put the largest first.

So you see a difference in the behavior of eig and eigh that isn't really there.
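A quick sanity check of the ordering, using the same rank-1 construction as above (the seed is only an assumption for reproducibility):

```python
import numpy as np

np.random.seed(0)  # reproducibility only; any seed shows the same behaviour
u = np.random.uniform(size=(100,))
c = np.outer(u, u)  # rank-1 symmetric: one eigenvalue u.dot(u), the rest ~0

w_h = np.linalg.eigh(c)[0]
print(w_h[0])    # a tiny number: numerical noise around 0
print(w_h[-1])   # the dominant eigenvalue: eigh sorts in ascending order
print(u.dot(u))  # for a rank-1 outer product this equals the dominant eigenvalue
```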
In practical terms, your matrix appears to be low rank plus numerical noise, so the smallish eigenvalues should really be 0. In that case the computation of eigenvectors is basically a lottery: there is a huge linear subspace that the matrix very nearly annihilates, and anything from that subspace can pass as an eigenvector. The result of the multiplication c.dot(eigenvector) is then just a bunch of numerical noise.
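So with eigh, the dominant eigenvector is simply the last column, not the first. Picking that column makes the ratio constant, as expected (same rank-1 sketch as above; the seed is an assumption for reproducibility):

```python
import numpy as np

np.random.seed(0)  # reproducibility only
u = np.random.uniform(size=(100,))
c = np.outer(u, u)

eigenvalues_h, eigenvectors_h = np.linalg.eigh(c)
v = eigenvectors_h[:, -1]  # eigenvector of the LARGEST eigenvalue
print(c.dot(v) / v)        # the dominant eigenvalue repeated 100 times
print(np.allclose(c.dot(v) / v, eigenvalues_h[-1]))  # True
```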
Moral: Don't forget to look at the whole picture.
eigenvalues_h, eigenvectors_h = np.linalg.eigh(c)
print(eigenvalues_h)  # ascending: near-zero noise first, the dominant eigenvalue last
eigenvalues, eigenvectors = np.linalg.eig(c)
print(eigenvalues)    # no guaranteed order; here the dominant eigenvalue comes first
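And the whole picture, as a sketch: once sorted, both routines give the same spectrum; they only differ in ordering (seed again assumed for reproducibility):

```python
import numpy as np

np.random.seed(0)  # reproducibility only
u = np.random.uniform(size=(100,))
c = np.outer(u, u)

w_h = np.linalg.eigh(c)[0]  # ascending order
w = np.linalg.eig(c)[0]     # no guaranteed order

# Same spectrum once sorted; taking .real is harmless here because a
# symmetric matrix has real eigenvalues
print(np.allclose(np.sort(w_h), np.sort(w.real)))  # True
```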