Search code examples
python-3.xnumpymatrixscipysparse-matrix

Split a sparse square matrix into blocks


I have a banded sparse square matrix , A, of type <class 'scipy.sparse.csr.csr_matrix'> and size = 400 x 400. I'd like to split this into block square matrices of size 200 x 200 each. For instance, the first block

block1 = A[0:200, 0:200]
block2 = A[100:300, 100:300]
block3 = A[200:400, 200:400]

The same information about the slices is stored in a list of tuples.

 [(0,200), (100, 300), (200, 400)]

Suggestions on how to split the spare square matrix will be really helpful.


Solution

  • You can convert to a regular array and then split it:

    from scipy.sparse import csr_matrix
    import numpy as np
    
    row = np.arange(400)[::2]
    col = np.arange(400)[1::2]
    data = np.random.randint(1, 10, (200))
    compressed_matrix = csr_matrix((data, (row, col)), shape=(400, 400))
    
    # Convert to a regular array
    m = compressed_matrix.toarray()
    # Split the matrix
    sl = [(0,200), (100, 300), (200, 400)]
    blocks = [m[i, i] for i in map(lambda x: slice(*x), sl)]
    

    And if you want you can convert back each block to a compressed matrix:

    blocks_csr = list(map(csr_matrix, blocks))
    

    CODE EXPLANATION

    The creation of the blocks is based on a list comprehension and basic slicing. Each input tuple is converted to a slice object, only to create a series of row and column indexes, corresponding to that of the elements to be selected; in this answer, this is sufficient to select the requested block squared matrix. Slice objects are generated when extended indexing syntax is used: To be clear, a[start:stop:step] will create a slice object equivalent to slice(start, stop, step). In our case, they are used to dynamically change the indexes to be selected, according to the matrix we want to extract. So, if you consider the first block, m[i, i] is equivalent to m[0:200, 0:200].

    Slice objects are a form of basic indexing, so a view of the original array is created, rather than a copy (this means that if you modify the view, also the original array will be modified: you can easily create a copy of the original data using the copy method of the numpy array).

    The map object is used to generate slice objects from the input tuples; map applies the function provided as its first argument to all the elements of its second argument. lambda is used to create an anonymous function, i.e., a function defined without a name. Anonymous functions are useful to accomplish specific tasks that you do not want to code in a standard function, because you are not going to reuse them or you need only for a short period of time, like in the example of this code. They make code more compact rather than defining the correspondent functions.

    *x is called unpacking, i.e you extract, unpack elements from the tuple. Suppose you have a function f and a tuple a = (1, 2, 3), then f(*a) is equivalent to f(1, 2, 3) (as you can see, you can think of unpacking as removing a level of parentheses).

    So, looking back at the code:

    blocks = [  # this is a list comprehension
        m[i, i]  # basic slicing of the input array
        for i in map(  # map apply a function to all the item of a list
            lambda x: slice(*x), sl  # creating a slice object out of the provided index ranges
        )
    ]