Accessing sparse matrix elements


I have a very large sparse matrix of type 'scipy.sparse.coo.coo_matrix'. I can convert it to csr with .tocsr(), but .todense() will not work because the array is too large. I want to extract elements from the matrix as I would with a regular array, so that I can pass row elements to a function.

For reference, when printed, the matrix looks as follows:

(7, 0)  0.531519363001
(48, 24)    0.400946334437
(70, 6) 0.684460955022
...

Solution

  • Make a matrix with 3 elements:

    # assumes an earlier `from scipy import sparse`
    In [550]: M = sparse.coo_matrix(([.5,.4,.6],([0,1,2],[0,5,3])), shape=(5,7))
    

    Its default display (repr(M)):

    In [551]: M
    Out[551]: 
    <5x7 sparse matrix of type '<class 'numpy.float64'>'
        with 3 stored elements in COOrdinate format>
    

    and its print display (str(M)), which looks like the input in the question:

    In [552]: print(M)
      (0, 0)    0.5
      (1, 5)    0.4
      (2, 3)    0.6
    

    Convert to csr format, which supports indexing (coo does not):

    In [553]: Mc=M.tocsr()
    In [554]: Mc[1,:]   # row 1 is another matrix (1 row):
    Out[554]: 
    <1x7 sparse matrix of type '<class 'numpy.float64'>'
        with 1 stored elements in Compressed Sparse Row format>
    
    In [555]: Mc[1,:].A    # that row as 2d array
    Out[555]: array([[ 0. ,  0. ,  0. ,  0. ,  0. ,  0.4,  0. ]])
    
    In [556]: print(Mc[1,:])    # like 2nd element of M except for row number
      (0, 5)    0.4
    

    Individual element:

    In [560]: Mc[1,5]
    Out[560]: 0.40000000000000002
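
To connect the indexing above back to the question (passing row elements to a function without densifying the whole matrix), here is a minimal sketch. It rebuilds the same 5x7 example, and `row_sum` is a hypothetical stand-in for whatever function consumes a row. It uses `.toarray()`, the method equivalent of the `.A` shorthand, and only one row is dense at any time.

```python
from scipy import sparse

# Same 5x7 example matrix as in the answer, built in COO format.
M = sparse.coo_matrix(([.5, .4, .6], ([0, 1, 2], [0, 5, 3])), shape=(5, 7))
Mc = M.tocsr()  # CSR supports row indexing

def row_sum(row):
    """Hypothetical stand-in for the function that consumes one dense row."""
    return float(row.sum())

results = []
for i in range(Mc.shape[0]):
    # Densify a single row only: Mc[i, :] is a 1x7 sparse matrix,
    # .toarray() gives a (1, 7) array, .ravel() flattens it to length 7.
    dense_row = Mc[i, :].toarray().ravel()
    results.append(row_sum(dense_row))

print(results)
```

For a very large matrix this loop can be slow, because each `Mc[i, :]` builds a new sparse matrix object; whether that matters depends on the number of rows.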
    

    The data attributes of these formats (if you want to dig further):

    In [562]: Mc.data
    Out[562]: array([ 0.5,  0.4,  0.6])
    In [563]: Mc.indices
    Out[563]: array([0, 5, 3], dtype=int32)
    In [564]: Mc.indptr
    Out[564]: array([0, 1, 2, 3, 3, 3], dtype=int32)
    In [565]: M.data
    Out[565]: array([ 0.5,  0.4,  0.6])
    In [566]: M.col
    Out[566]: array([0, 5, 3], dtype=int32)
    In [567]: M.row
    Out[567]: array([0, 1, 2], dtype=int32)
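
The attributes above can also be used directly to pull out the stored entries of one row without creating a new sparse matrix per row. The following is a sketch under the assumption that the matrix is in CSR format; `row_nonzeros` is a hypothetical helper name, not part of SciPy. In CSR, the entries of row `i` occupy positions `indptr[i]` up to (not including) `indptr[i+1]` in `indices` and `data`.

```python
from scipy import sparse

# Same 5x7 example matrix as in the answer.
M = sparse.coo_matrix(([.5, .4, .6], ([0, 1, 2], [0, 5, 3])), shape=(5, 7))
Mc = M.tocsr()

def row_nonzeros(csr, i):
    """Hypothetical helper: (column indices, values) stored for row i."""
    start, end = csr.indptr[i], csr.indptr[i + 1]
    return csr.indices[start:end], csr.data[start:end]

cols, vals = row_nonzeros(Mc, 1)              # row 1 stores 0.4 in column 5
empty_cols, empty_vals = row_nonzeros(Mc, 3)  # row 3 has no stored elements

# The COO attributes allow the same lookup with a boolean mask on M.row,
# at the cost of scanning all stored elements for every row requested.
mask = M.row == 1
coo_cols, coo_vals = M.col[mask], M.data[mask]

print(cols, vals)
print(coo_cols, coo_vals)
```

The returned arrays are slices of the matrix's own storage, so they should be treated as read-only unless modifying the matrix is intended.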