r matrix linear-algebra matrix-multiplication matrix-inverse

Large matrix inversion in R error after 250x250


I receive the following error when trying to invert a matrix that is larger than 250x250. The error only appears once the matrix exceeds this size.

Error in solve.default(S) : 
  system is computationally singular: reciprocal condition number = 2.10729e-20

I've tried other matrices larger than this that do invert. I've checked for multicollinearity in the matrix and there is none. What could cause this error?

Edit: changing the tolerance does prevent the error. But why does the error get thrown only when the matrix is larger than 250x250?


Solution

  • There is no inherent problem with inverting matrices larger than 250x250:

    set.seed(12345)
    N = 300
    m <- matrix(rnorm(N*N), nrow = N)
    str(m)
    # num [1:300, 1:300] 0.586 0.709 -0.109 -0.453 0.606 ...
    
    m.inv <- solve(m)
    str(m.inv)
    # num [1:300, 1:300] 0.0274 -0.0164 0.0384 -0.0936 -0.1086 ...
    

    However, if the matrix is singular (its determinant is 0) or very close to singular, calculating the inverse will fail:

    p <- matrix(7, nrow = N, ncol=N)
    str(p)
    # num [1:300, 1:300] 7 7 7 7 7 7 7 7 7 7 ...
    
    p.inv <- solve(p)
    #Error in solve.default(p) : 
    #  Lapack routine dgesv: system is exactly singular: U[2,2] = 0
    

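    In your case it looks like the matrix is nearly singular: solve() stops when the reciprocal condition number (2.10729e-20 in your error message) falls below tol. Try specifying a tolerance smaller than that value:

    solve(..., tol = 1e-21)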
    
    # The default tolerance used by solve() is the machine epsilon:
    .Machine$double.eps
    #[1] 2.220446e-16
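
    As a rough sketch of how the tolerance check behaves, the snippet below uses a made-up 2x2 diagonal matrix d (not your data) whose reciprocal condition number is about 1e-20, similar to the value in your error message. rcond() and the tol argument are base R; the matrix and variable names are only illustrative.

    ```r
    # Hypothetical matrix: well-defined inverse, but extremely ill-conditioned
    d <- diag(c(1, 1e-20))

    # Reciprocal condition number, the quantity solve() compares against tol
    rc <- rcond(d)
    print(rc)
    print(rc < .Machine$double.eps)   # below the default tolerance

    # With the default tolerance solve() refuses; capture the message instead of stopping
    msg <- tryCatch(solve(d), error = function(e) conditionMessage(e))
    print(msg)

    # With a tolerance below rcond(d) the same call goes through
    d.inv <- solve(d, tol = 1e-21)
    print(d.inv)
    ```

    Keep in mind that lowering tol only silences the check. For a matrix this badly conditioned, the computed inverse can carry large numerical errors.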
    

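    A better approach is probably to check first how close the matrix is to singular and handle those cases explicitly. The determinant is one indicator, although its value also depends on the scale and size of the matrix: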

    det(p)
    #[1] 0
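
    One caveat, sketched with a made-up matrix s: a tiny determinant on its own does not have to mean that the inverse is hard to compute, because the determinant shrinks with the scale of the entries and the size of the matrix. Here s is simply 0.1 times the identity, so it is perfectly conditioned even though its determinant is minuscule.

    ```r
    N <- 300
    s <- 0.1 * diag(N)      # hypothetical, perfectly conditioned matrix

    det.s <- det(s)         # 0.1^300, tiny but not exactly zero
    rc.s  <- rcond(s)       # reciprocal condition number
    print(det.s)
    print(rc.s)

    # The inverse is computed without complaint: it is just 10 times the identity
    s.inv <- solve(s)
    print(all.equal(s.inv, 10 * diag(N)))
    ```

    So rcond() is usually the more direct thing to look at when solve() complains about a computationally singular system.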
    

    To answer your question of why the error occurs only when your matrix is larger than 250x250: I would recommend that you calculate the determinant (or, more directly, the reciprocal condition number with rcond(), which is what solve() compares against the tolerance) for your 250x250 matrix, then compute the same quantity for a larger matrix and compare the values. The second value is probably smaller than the tolerance, while the first one is not.
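
    The comparison across sizes could look something like the sketch below. Since I don't have your data, it uses Hilbert matrices as a stand-in family whose conditioning gets worse as the size grows; hilbert(), sizes and report are names invented for this example. The log of the absolute determinant is reported via determinant() to avoid underflow.

    ```r
    # Hypothetical stand-in for your matrices: Hilbert matrix of size n
    hilbert <- function(n) {
      outer(seq_len(n), seq_len(n), function(i, j) 1 / (i + j - 1))
    }

    sizes <- c(4, 8, 12, 16)

    report <- data.frame(
      n = sizes,
      # reciprocal condition number, compared by solve() against tol
      rcond = sapply(sizes, function(n) rcond(hilbert(n))),
      # log(|det|), which stays finite where det() itself would underflow
      log.abs.det = sapply(sizes, function(n) {
        as.numeric(determinant(hilbert(n), logarithm = TRUE)$modulus)
      })
    )

    # TRUE where solve() with the default tolerance would stop with an error
    report$below.tol <- report$rcond < .Machine$double.eps
    print(report)
    ```

    Running the same loop over growing sub-blocks of your own matrix, e.g. S[1:n, 1:n], should show at which size rcond drops below the tolerance.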