Search code examples
linear-algebranumerical-methodsscientific-computingmatrix-inverse

Matrix Inversion Methods


When one has a problem of a matrix inverse multiplication with a vector, as such:

Ax=b

one can take a Cholesky Decomposition of A and backsubstitute b to find the resulting vector x. However, a matrix inverse is sometimes needed when the problem is not formulated as above. My question is what is the best way to handle such a situation. Below, I have compared various ways(using numpy) to invert a positive definite matrix:

Firstly, generate the matrix:

>>> A = np.random.rand(5,5)
>>> A
array([[ 0.13516074,  0.2532381 ,  0.61169708,  0.99678563,  0.32895589],
       [ 0.35303998,  0.8549499 ,  0.39071336,  0.32792806,  0.74723177],
       [ 0.4016188 ,  0.93897663,  0.92574706,  0.93468798,  0.90682809],
       [ 0.03181169,  0.35059435,  0.10857948,  0.36422977,  0.54525   ],
       [ 0.64871162,  0.37809219,  0.35742865,  0.7154568 ,  0.56028468]])
>>> A = np.dot(A.transpose(), A)
>>> A
array([[ 0.72604206,  0.96959581,  0.82773451,  1.10159817,  1.05327233],
       [ 0.96959581,  1.94261607,  1.53140854,  1.80864185,  1.9766411 ],
       [ 0.82773451,  1.53140854,  1.52338262,  1.89841402,  1.59213299],
       [ 1.10159817,  1.80864185,  1.89841402,  2.61930178,  2.01999385],
       [ 1.05327233,  1.9766411 ,  1.59213299,  2.01999385,  2.10012097]])

The results for the method of direct inversion are as follows:

>>> np.linalg.inv(A)
array([[  5.49746838,  -1.92540877,   2.24730018,  -2.20242449,
         -0.53025806],
       [ -1.92540877,  95.34219156, -67.93144606,  50.16450952,
        -85.52146331],
       [  2.24730018, -67.93144606,  57.0739859 , -40.56297863,
         58.55694127],
       [ -2.20242449,  50.16450952, -40.56297863,  30.6441555 ,
        -44.83400183],
       [ -0.53025806, -85.52146331,  58.55694127, -44.83400183,
         79.96573405]])

When using the Moore-Penrose Pseudoinverse, the results are as follows(you might notice that to the displayed precision, the results are the same as direct inversion):

>>> np.linalg.pinv(A)
array([[  5.49746838,  -1.92540877,   2.24730018,  -2.20242449,
         -0.53025806],
       [ -1.92540877,  95.34219156, -67.93144606,  50.16450952,
        -85.52146331],
       [  2.24730018, -67.93144606,  57.0739859 , -40.56297863,
         58.55694127],
       [ -2.20242449,  50.16450952, -40.56297863,  30.6441555 ,
        -44.83400183],
       [ -0.53025806, -85.52146331,  58.55694127, -44.83400183,
         79.96573405]])

Finally, when solving with the identity matrix:

>>> np.linalg.solve(A, np.eye(5))
array([[  5.49746838,  -1.92540877,   2.24730018,  -2.20242449,
         -0.53025806],
       [ -1.92540877,  95.34219156, -67.93144606,  50.16450952,
        -85.52146331],
       [  2.24730018, -67.93144606,  57.0739859 , -40.56297863,
         58.55694127],
       [ -2.20242449,  50.16450952, -40.56297863,  30.6441555 ,
        -44.83400183],
       [ -0.53025806, -85.52146331,  58.55694127, -44.83400183,
         79.96573405]])

Again, you might notice that on a cursory inspection, the result is the same as the previous two methods.

It is well known that matrix inversion is an ill posed problem due to numerical instability that should be avoided where possible. However, in situations where it appears unavoidable, what is the preferable approach and why? To clarify, I am referring to the best approach when implementing such equations in software.

An example of such a problem is provided with another of my questions.


Solution

  • The reason for avoiding inverting matrices has only to do with efficiency. It is faster to solve the linear systems directly. If you think of the problem in your linked question a bit differently, you can apply the same principles.

    In order to find the matrix inv(K) * Y * T(Y) * inv(K) - D * inv(K) you can solve the following systems of equations:

     K * R * K = Y * T(Y)
    

    You can solve it in two parts:

     R2 * K = R1
     K * R1 = Y * T(Y)
    

    So you first solve for R1 with your usual method, then solve for R2 (recognise that you can solve T(K) * T(R2) = T(R1) if you have to).

    However, at this point I don't know if this will be more efficient than computing the inverse explicitly unless K is symmetric. (There may be a way to efficiently get the decomposition of T(K) from K, but I don't know offhand)

    If K is symmetric then you can compute your decomposition on K once and reuse it for the two back-substitution steps and it might be more efficient than computing the inverse explicitly.