I have a question about the way scipy builds block diagonal matrices. I was expecting that creating a sparse block diagonal matrix would be much quicker and more efficient than creating a dense one (because of sparsity compression). But it turns out that's not the case (or maybe I'm using some inefficient method):
from timeit import default_timer as timer
import numpy as np
from scipy.sparse import block_diag as bd_sp
from scipy.linalg import block_diag as bd_la

m = [np.identity(1)] * 10000

before = timer()
res = bd_sp(m)
print(timer() - before)
# takes 33.79 secs

before = timer()
res = bd_la(*m)
print(timer() - before)
# takes 0.069 secs
What am I missing? Thanks in advance for your replies.
In [625]: [np.identity(1)*i for i in range(1,5)]
Out[625]: [array([[1.]]), array([[2.]]), array([[3.]]), array([[4.]])]
In [626]: sparse.block_diag(_)
Out[626]:
<4x4 sparse matrix of type '<class 'numpy.float64'>'
with 4 stored elements in COOrdinate format>
In [627]: _.A
Out[627]:
array([[1., 0., 0., 0.],
       [0., 2., 0., 0.],
       [0., 0., 3., 0.],
       [0., 0., 0., 4.]])
block_diag uses bmat to join the elements. bmat makes coo matrices from all the elements, combines their attributes with offsets, and makes a new coo matrix. The code is readable Python.

It may be more efficient to construct your own data, row, col arrays; see the sketch below. block_diag is a convenience, and fine for combining a few large matrices, but not efficient when combining many small ones.
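For example, a minimal sketch of that approach for the 10000-block case in the question (every block is a 1x1 identity, so each one contributes a single nonzero on the diagonal):

import numpy as np
from scipy import sparse

# Same matrix as bd_sp(m) above, but the coo data/row/col
# arrays are filled directly instead of by looping over blocks.
n = 10000
data = np.ones(n)    # one nonzero per 1x1 block
row = np.arange(n)   # block k sits at (k, k)
col = np.arange(n)
res = sparse.coo_matrix((data, (row, col)), shape=(n, n))

This avoids iterating over 10000 small Python objects, which is where block_diag spends its time.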
The linalg function is also Python (and pretty short). It creates an out array of the right shape, and inserts the blocks with sliced indexing. That's an efficient dense array solution: most of the hard work is done in compiled numpy code.
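A rough sketch of how such a dense builder works (this is the general idea, not the exact scipy.linalg.block_diag source):

import numpy as np

def dense_block_diag(*blocks):
    # Allocate the full output once, then paste each block in
    # along the diagonal with sliced indexing; numpy does the
    # actual copying in compiled code.
    blocks = [np.atleast_2d(b) for b in blocks]
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols), dtype=blocks[0].dtype)
    r = c = 0
    for b in blocks:
        br, bc = b.shape
        out[r:r + br, c:c + bc] = b
        r += br
        c += bc
    return out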
Sparse matrices can be faster when doing matrix multiplication (and related linalg solvers). For most other operations, including initialization, they are slower than equivalent dense code. They are also valuable when the problem is too big to fit in memory as a dense array.
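To see where sparsity does pay off, compare a matrix-vector product (timings are machine-dependent; the point is that the sparse product only touches the stored nonzeros):

import numpy as np
from scipy import sparse
from timeit import default_timer as timer

n = 10000
S = sparse.identity(n, format='csr')  # n x n identity, n stored nonzeros
D = np.identity(n)                    # dense equivalent, n*n floats
x = np.ones(n)

before = timer()
y = S @ x   # work proportional to the number of nonzeros
print(timer() - before)

before = timer()
y = D @ x   # work proportional to n*n
print(timer() - before)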