
How to show a sparse matrix with increasing data


How can I display a sparse matrix so that its stored values appear in increasing order? By default, the entries seem to be ordered by column.


Solution

  • Here's a variation on the basic approach: sort the data, then display the coordinates in the same order.

    First make a sample random matrix with integer values (for ease of display). I'm using coo format because that's the one that shows the simplest relation between data and coordinates.

    In [9]: M = (sparse.random(10,10,.1, 'coo')*10).astype(int)
    In [10]: M
    Out[10]: 
    <10x10 sparse matrix of type '<class 'numpy.int32'>'
        with 10 stored elements in COOrdinate format>
    In [11]: print(M)
      (1, 3)    1
      (3, 4)    7
      (5, 4)    1
      (7, 4)    8
      (5, 6)    9
      (6, 6)    7
      (5, 8)    4
      (6, 8)    1
      (4, 9)    5
      (9, 9)    1
    

    In IPython, the bare display of an object is its repr, while print shows the str version.

    Now get the sort order of the data:

    In [12]: idx = np.argsort(M.data)
    

    and make a new matrix with row and column in the same order:

    In [13]: M1 = sparse.coo_matrix((M.data[idx],(M.row[idx], M.col[idx])),shape=M.shape)    
    In [14]: M1
    Out[14]: 
    <10x10 sparse matrix of type '<class 'numpy.int32'>'
        with 10 stored elements in COOrdinate format>
    
    In [15]: print(M1)
      (1, 3)    1
      (5, 4)    1
      (6, 8)    1
      (9, 9)    1
      (5, 8)    4
      (4, 9)    5
      (3, 4)    7
      (6, 6)    7
      (7, 4)    8
      (5, 6)    9
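
The steps above can be collected into a standalone script. This is a minimal sketch rather than the exact session: the helper name `sort_coo_by_value` is made up for illustration, the matrix is rebuilt by hand from the coordinates printed above instead of being drawn at random, and a stable sort is an added choice so that equal values keep their original relative order.

```python
import numpy as np
from scipy import sparse


def sort_coo_by_value(m):
    """Return a COO matrix with the same entries as `m`, stored in ascending data order.

    Hypothetical helper, not part of scipy.
    """
    coo = sparse.coo_matrix(m)  # accepts other sparse formats as well
    idx = np.argsort(coo.data, kind="stable")  # ties keep their original order
    return sparse.coo_matrix(
        (coo.data[idx], (coo.row[idx], coo.col[idx])), shape=coo.shape
    )


# Same entries as the sample matrix printed above, entered by hand.
rows = np.array([1, 3, 5, 7, 5, 6, 5, 6, 4, 9])
cols = np.array([3, 4, 4, 4, 6, 6, 8, 8, 9, 9])
vals = np.array([1, 7, 1, 8, 9, 7, 4, 1, 5, 1])
M = sparse.coo_matrix((vals, (rows, cols)), shape=(10, 10))

M1 = sort_coo_by_value(M)
print(M1)  # entries are listed in ascending value order
```

The exact layout of the printed output depends on the installed SciPy version, so only the ordering of the entries should be relied on.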
    

    It's the same matrix, but the data array is sorted. (`.A` is shorthand for `.toarray()`; newer SciPy releases may require the latter.)

    In [16]: M.A
    Out[16]: 
    array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
           [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 7, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0, 0, 0, 0, 5],
           [0, 0, 0, 0, 1, 0, 9, 0, 4, 0],
           [0, 0, 0, 0, 0, 0, 7, 0, 1, 0],
           [0, 0, 0, 0, 8, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]])
    
    In [17]: M1.A
    Out[17]: 
    array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
           [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 7, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0, 0, 0, 0, 5],
           [0, 0, 0, 0, 1, 0, 9, 0, 4, 0],
           [0, 0, 0, 0, 0, 0, 7, 0, 1, 0],
           [0, 0, 0, 0, 8, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]])
    
    In [18]: M1.data
    Out[18]: array([1, 1, 1, 1, 4, 5, 7, 7, 8, 9])
    

    Converting coo to csr format sorts the entries (lexically, by row and then by column), so the value ordering is lost and both coo matrices convert to the same csr matrix:

    In [19]: M.tocsr().data
    Out[19]: array([1, 7, 5, 1, 9, 4, 7, 1, 8, 1], dtype=int32)
    In [20]: M1.tocsr().data
    Out[20]: array([1, 7, 5, 1, 9, 4, 7, 1, 8, 1], dtype=int32)
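
As a small self-contained check of that point, the sketch below builds a tiny COO matrix (the 3x3 values are invented for illustration), reorders its entries by value, and compares the CSR arrays produced from both versions.

```python
import numpy as np
from scipy import sparse

# Small made-up example: four stored entries in a 3x3 matrix.
rows = np.array([0, 1, 1, 2])
cols = np.array([1, 0, 2, 1])
vals = np.array([3, 1, 2, 1])
A = sparse.coo_matrix((vals, (rows, cols)), shape=(3, 3))

# Same entries, stored in ascending value order.
idx = np.argsort(A.data, kind="stable")
B = sparse.coo_matrix((A.data[idx], (A.row[idx], A.col[idx])), shape=A.shape)

a_csr, b_csr = A.tocsr(), B.tocsr()

# The CSR conversion discards the COO storage order, so all three arrays agree.
same_csr = (
    np.array_equal(a_csr.data, b_csr.data)
    and np.array_equal(a_csr.indices, b_csr.indices)
    and np.array_equal(a_csr.indptr, b_csr.indptr)
)
print(same_csr)
```

So a value-sorted display has to be produced from the COO form (or from the raw arrays); it will not survive a round trip through CSR.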
    

    The canonical csr order is by rows, and by columns within each row:

    In [23]: print(M1.tocsr())
      (1, 3)    1
      (3, 4)    7
      (4, 9)    5
      (5, 4)    1
      (5, 6)    9
      (5, 8)    4
      (6, 6)    7
      (6, 8)    1
      (7, 4)    8
      (9, 9)    1
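
If the matrix is already stored as csr (or another format) and only the display matters, one option is to list the entries in value order without building a second matrix. The function `entries_by_value` below is a hypothetical helper, and the sample matrix is made up for illustration.

```python
import numpy as np
from scipy import sparse


def entries_by_value(m):
    """Return (row, col, value) tuples of a sparse matrix in ascending value order.

    Hypothetical helper, not part of scipy.
    """
    coo = m.tocoo()
    order = np.argsort(coo.data, kind="stable")  # ties keep the COO order
    return [(int(coo.row[i]), int(coo.col[i]), coo.data[i].item()) for i in order]


dense = np.array([[0, 3, 0],
                  [1, 0, 2],
                  [0, 1, 0]])
C = sparse.csr_matrix(dense)

triples = entries_by_value(C)
for r, c, v in triples:
    print(f"  ({r}, {c})\t{v}")
```

This leaves the stored matrix untouched and only changes the order in which its entries are printed.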