How to get the blocks back from a scipy sparse block matrix?

After some vectorized calculations, I get a sparse block matrix with all my results stacked in blocks of same size.

>>> A = [[1, 1],
...      [1, 1]]
>>> B = [[2, 2],
...      [2, 2]]
>>> C = [[3, 3],
...      [3, 3]]
>>> results = scipy.sparse.block_diag(A, B, C)
>>> print(results.toarray())
[[1 1 0 0 0 0]
 [1 1 0 0 0 0]
 [0 0 2 2 0 0]
 [0 0 2 2 0 0]
 [0 0 0 0 3 3]
 [0 0 0 0 3 3]]

How can I get back these arrays A,B,C in an efficient way, if necessery by providing their shape (2,2)?

Solution

In [177]: >>> A = [[1, 1],
     ...: ...      [1, 1]]
     ...: >>> B = [[2, 2],
     ...: ...      [2, 2]]
     ...: >>> C = [[3, 3],
     ...: ...      [3, 3]]
     ...: >>> results = sparse.block_diag([A, B, C])
     ...:      
In [178]: results
Out[178]: 
<6x6 sparse matrix of type '<class 'numpy.int64'>'
    with 12 stored elements in COOrdinate format>

block_diag does not preserve the inputs; rather it creates coo format matrix, representing the whole matrix, not the pieces.

In [194]: results.data
Out[194]: array([1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3], dtype=int64)
In [195]: results.row
Out[195]: array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5], dtype=int32)
In [196]: results.col
Out[196]: array([0, 1, 0, 1, 2, 3, 2, 3, 4, 5, 4, 5], dtype=int32)


In [179]: results.A
Out[179]: 
array([[1, 1, 0, 0, 0, 0],
       [1, 1, 0, 0, 0, 0],
       [0, 0, 2, 2, 0, 0],
       [0, 0, 2, 2, 0, 0],
       [0, 0, 0, 0, 3, 3],
       [0, 0, 0, 0, 3, 3]], dtype=int64)

block_diag pass the arrays to sparse.bmat. That in turn makes a coo matrix from each, and then merges the coo attributes into 3 arrays, which are inputs to the global sparse matrix.

There is another sparse format bsr that may preserve the blocks (until conversion to csr for calculation), but I'll have to experiment to see that's the case.

Let's make a bsr from that results coo:

In [186]: bresults = sparse.bsr_matrix(results)
In [187]: bresults
Out[187]: 
<6x6 sparse matrix of type '<class 'numpy.int64'>'
    with 12 stored elements (blocksize = 2x2) in Block Sparse Row format>
In [188]: bresults.blocksize
Out[188]: (2, 2)
In [189]: bresults.data
Out[189]: 
array([[[1, 1],
        [1, 1]],

       [[2, 2],
        [2, 2]],

       [[3, 3],
        [3, 3]]], dtype=int64)

So it deduces that there are blocks, just as you desired.

In [191]: bresults.indices
Out[191]: array([0, 1, 2], dtype=int32)
In [192]: bresults.indptr
Out[192]: array([0, 1, 2, 3], dtype=int32)

So it's a csr like storage, but with the data arranged in blocks.

It may be possible to construct this from your A,B,C without the block_diag intermediary, but I'd have to look at the docs more.