python scipy sparse-matrix

Create sparse DIA matrix, and then change columns


I have two vectors, t and delta, both of length n. I want to create a sparse DIA matrix A with t on the first subdiagonal and then adjust the columns: for every i, the ith entry of A should move delta[i] columns to the left.

Column indices are easy to manipulate in the COO format, so here is what I thought would work:

from scipy.sparse import diags
A = diags([t], offsets=[-1]).tocoo()
A.col = A.col - delta

However, in my example A.nnz (which equals len(A.col)) is only 216, while t and delta both have length 239, so the subtraction cannot line up. I don't understand how that happened, given that nnz is documented as the "Number of stored values, including explicit zeros."

How can I tackle this problem? Here's my example data:

import numpy as np
t = np.array([ 2.655,  2.655,  2.655,  2.655,  2.655,  2.655,  2.655,  2.655,
        2.655,  2.655,  2.655,  2.655,  2.655,  2.655,  2.655,  2.655,
        2.655,  2.655,  2.655,  2.655,  2.655,  2.655,  2.655,  2.155,
        2.155,  2.155,  2.155,  2.155,  2.155,  2.155,  2.155,  2.155,
        2.155,  2.155,  2.155,  2.155,  2.155,  2.155,  2.155,  2.155,
        2.155,  2.155,  2.155,  2.155,  2.155,  2.155,  2.155,  1.655,
        1.655,  1.655,  1.655,  1.655,  1.655,  1.655,  1.655,  1.655,
        1.655,  1.655,  1.655,  1.655,  1.655,  1.655,  1.655,  1.655,
        1.655,  1.655,  1.655,  1.655,  1.655,  1.655,  1.655,  1.155,
        1.155,  1.155,  1.155,  1.155,  1.155,  1.155,  1.155,  1.155,
        1.155,  1.155,  1.155,  1.155,  1.155,  1.155,  1.155,  1.155,
        1.155,  1.155,  1.155,  1.155,  1.155,  1.155,  1.155,  0.655,
        0.655,  0.655,  0.655,  0.655,  0.655,  0.655,  0.655,  0.655,
        0.655,  0.655,  0.655,  0.655,  0.655,  0.655,  0.655,  0.655,
        0.655,  0.655,  0.655,  0.655,  0.655,  0.655,  0.655,  0.405,
        0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,
        0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,
        0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,
        0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,
        0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,
        0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,
        0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,
        0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,
        0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,
        0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,
        0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,
        0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.405,  0.   ,
        0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,
        0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,
        0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ,  0.   ])
delta = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5,
   5, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5,
   5, 5, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5,
   5, 5, 5, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4,
   5, 5, 5, 5, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4,
   4, 5, 5, 5, 5, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4,
   4, 4, 5, 5, 5, 5, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4,
   4, 4, 4, 5, 5, 5, 5, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3,
   4, 4, 4, 4, 5, 5, 5, 5, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3,
   3, 4, 4, 4, 4, 5, 5, 5, 5, 0, 0, 0, 0, 1, 1, 1, 1, 2, 0, 2, 2, 3, 2,
   1, 3, 4, 3, 0, 3, 5, 4, 1, 0])

Solution

    A small example with trailing zeros reproduces the effect (np is numpy, sparse is scipy.sparse):

    In [29]: t = np.array([1.2, 3.2, 4, 0, 0])
    In [30]: A = sparse.diags([t], offsets=[-1])
    In [31]: A
    Out[31]: 
    <6x6 sparse matrix of type '<class 'numpy.float64'>'
        with 5 stored elements (1 diagonals) in DIAgonal format>
    

    The conversion to COO strips out the zeros:

    In [32]: Ac = A.tocoo()
    In [33]: Ac
    Out[33]: 
    <6x6 sparse matrix of type '<class 'numpy.float64'>'
        with 3 stored elements in COOrdinate format>
    

    Look at the code for A.tocoo (the DIA-to-COO version): it includes (self.data != 0) in its mask, so zero-valued entries on the diagonal are dropped during the conversion.
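
To make the effect of that mask concrete, here is a minimal NumPy-only sketch. It is a simplified imitation written for illustration, not SciPy's actual source; the variable names (row, col, mask, in_bounds, kept) are my own, and it handles only a single diagonal.

```python
import numpy as np

# Illustration only: a simplified imitation of a DIA -> COO mask,
# not the real SciPy implementation.
t = np.array([1.2, 3.2, 4.0, 0.0, 0.0])
offset = -1                      # first subdiagonal
n_rows = n_cols = len(t) + 1     # 6x6, as in the example above

col = np.arange(len(t))          # column index of each stored value
row = col - offset               # row index on the chosen diagonal

# Keep only positions that fall inside the matrix ...
mask = (row >= 0) & (row < n_rows) & (col < n_cols)
in_bounds = int(mask.sum())

# ... and then drop the values that are exactly zero.
mask &= (t != 0)
kept = int(mask.sum())

print(in_bounds, kept)
```

All five positions are inside the matrix, but only the three nonzero values survive the second step, which matches the drop from 5 to 3 stored elements seen above.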


    If I make the COO matrix directly, it retains the zeros, at least temporarily:

    In [58]: A.toarray()
    Out[58]: 
    array([[ 0. ,  0. ,  0. ,  0. ,  0. ,  0. ],
           [ 1.2,  0. ,  0. ,  0. ,  0. ,  0. ],
           [ 0. ,  3.2,  0. ,  0. ,  0. ,  0. ],
           [ 0. ,  0. ,  4. ,  0. ,  0. ,  0. ],
           [ 0. ,  0. ,  0. ,  0. ,  0. ,  0. ],
           [ 0. ,  0. ,  0. ,  0. ,  0. ,  0. ]])
    In [59]: M = sparse.coo_matrix((t, (np.arange(1,6),np.arange(5))),shape=(6,6))
    In [60]: M
    Out[60]: 
    <6x6 sparse matrix of type '<class 'numpy.float64'>'
        with 5 stored elements in COOrdinate format>
    In [61]: M.toarray()
    Out[61]: 
    array([[ 0. ,  0. ,  0. ,  0. ,  0. ,  0. ],
           [ 1.2,  0. ,  0. ,  0. ,  0. ,  0. ],
           [ 0. ,  3.2,  0. ,  0. ,  0. ,  0. ],
           [ 0. ,  0. ,  4. ,  0. ,  0. ,  0. ],
           [ 0. ,  0. ,  0. ,  0. ,  0. ,  0. ],
           [ 0. ,  0. ,  0. ,  0. ,  0. ,  0. ]])
    

    The zeros can then be removed in place:

    In [64]: M.eliminate_zeros()
    In [65]: M
    Out[65]: 
    <6x6 sparse matrix of type '<class 'numpy.float64'>'
        with 3 stored elements in COOrdinate format>
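
Putting this together for the original goal, one option is to skip the DIA step and build the COO matrix directly with the shifted column indices, so that t, delta and the index arrays always have the same length. The following is a sketch under assumptions: the delta values are made up for this small example, and the bounds check is my own addition to catch shifts that would move an entry past column 0.

```python
import numpy as np
from scipy import sparse

# Small stand-in data; the trailing zeros mimic the question's t.
t = np.array([1.2, 3.2, 4.0, 0.0, 0.0])
delta = np.array([0, 1, 1, 2, 0])   # hypothetical shifts for illustration

n = len(t)
rows = np.arange(1, n + 1)    # subdiagonal: entry i sits in row i + 1
cols = np.arange(n) - delta   # entry i moved delta[i] columns to the left

# Assumption: a shift past the left edge is an error, not a wrap-around.
if (cols < 0).any():
    raise ValueError("delta moves an entry to the left of column 0")

# Direct construction keeps every value, including the explicit zeros.
M = sparse.coo_matrix((t, (rows, cols)), shape=(n + 1, n + 1))

print(M.nnz)        # number of stored values, zeros included
print(M.toarray())
```

Because every entry keeps its own row, shifting columns cannot create duplicate (row, col) pairs here. If the explicit zeros are not wanted afterwards, M.eliminate_zeros() removes them as shown above.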