I have a large coordinate grid (vectors a and b), for which I generate and solve a matrix (10x10) equation. Is there a way for scipy.linalg.solve
to accept vector input? So far my solution was to run for cycles over the coordinate arrays.
import time
import numpy as np
import scipy.linalg
N = 10
a = np.linspace(0, 1, 10**3)
b = np.linspace(0, 1, 2*10**3)
A = np.random.random((N, N)) # input matrix, not static
def f(a,b,n): # array-filling function
return a*b*n
def sol(x,y): # matrix solver
D = np.arange(0,N)
B = f(x,y,D)**2 + f(x-1, y+1, D) # source vector
X = scipy.linalg.solve(A,B)
return X # output an N-size vector
start = time.time()
answer = np.zeros(shape=(a.size, b.size)) # predefine output array
for egg in range(a.size): # an ugly double-for cycle
for ham in range(b.size):
aa = a[egg]
bb = b[ham]
answer[egg,ham] = sol(aa,bb)[0]
print time.time() - start
@yevgeniy You are right, efficiently solving multiple independent linear systems A x = b with scipy a bit tricky (assuming an A array that changes for every iteration).
For instance, here is a benchmark for solving 1000 systems of the form, A x = b, where A is a 10x10
matrix, and b
a 10
element vector. Surprisingly, the approach to put all this into one block diagonal matrix and call scipy.linalg.solve
once is indeed slower both with dense and sparse matrices.
import numpy as np
from scipy.linalg import block_diag, solve
from scipy.sparse import block_diag as sp_block_diag
from scipy.sparse.linalg import spsolve
N = 10
M = 1000 # number of coordinates
Ai = np.random.randn(N, N) # we can compute the inverse here,
# but let's assume that Ai are different matrices in the for loop loop
bi = np.random.randn(N)
%timeit [solve(Ai, bi) for el in range(M)]
# 10 loops, best of 3: 32.1 ms per loop
Afull = sp_block_diag([Ai]*M, format='csr')
bfull = np.tile(bi, M)
%timeit Afull = sp_block_diag([Ai]*M, format='csr')
%timeit spsolve(Afull, bfull)
# 1 loops, best of 3: 303 ms per loop
# 100 loops, best of 3: 5.55 ms per loop
Afull = block_diag(*[Ai]*M)
%timeit Afull = block_diag(*[Ai]*M)
%timeit solve(Afull, bfull)
# 100 loops, best of 3: 14.1 ms per loop
# 1 loops, best of 3: 23.6 s per loop
The solution of the linear system, with sparse arrays is faster, but the time to create this block diagonal array is actually very slow. As to dense arrays, they are simply slower in this case (and take lots of RAM).
Maybe I'm missing something about how to make this work efficiently with sparse arrays, but if you are keeping the for
loops, there are two things that you could do for optimizations.
From pure python, look at the source code of scipy.linalg.solve
: remove unnecessary tests and factorize all repeated operations outside of your loops. For instance, assuming your arrays are not symmetrical positives, we could do
from scipy.linalg import get_lapack_funcs
gesv, = get_lapack_funcs(('gesv',), (Ai, bi))
def solve_opt(A, b, gesv=gesv):
# not sure if copying A and B is necessary, but just in case (faster if arrays are not copied)
lu, piv, x, info = gesv(A.copy(), b.copy(), overwrite_a=False, overwrite_b=False)
if info == 0:
return x
if info > 0:
raise LinAlgError("singular matrix")
raise ValueError('illegal value in %d-th argument of internal gesv|posv' % -info)
%timeit [solve(Ai, bi) for el in range(M)]
%timeit [solve_opt(Ai, bi) for el in range(M)]
# 10 loops, best of 3: 30.1 ms per loop
# 100 loops, best of 3: 3.77 ms per loop
which results in a 6.5x speed up.
If you need even better performance, you would have to port this for loop in Cython and interface the gesv
BLAS functions directly in C, as discussed here, or better with the Cython API for BLAS/LAPACK in Scipy 0.16.
Edit: As @Eelco Hoogendoorn mentioned if your A matrix is fixed, there is a much simpler and more efficient approach.