Assign the values of a sparse matrix to a numpy array


import numpy as np
import scipy.sparse as scsp
from scipy.sparse import csr_matrix,lil_matrix

# create an empty numpy matrix
wi=np.empty((num_clusters*num_cluster_neurons, input))   
for i in range(num_clusters*num_cluster_neurons):         
    temp_neuron_prob=dic_cluster_prob[dic_neuron_cluster[i]]

    #create 1*input shape sparse matrix according to probability
    lil=lil_matrix(scsp.rand(1, input, temp_neuron_prob))

    #want to assign the 1*input sparse matrix to one slice of the numpy matrix
    wi[i,:]=lil[:]

I tried to assign the values of a lil_matrix to one slice of a numpy array, but it raises the error 'setting an array element with a sequence'.

Why does this error occur, given that both have the same shape, and how can I make this more efficient? I want the numpy array to hold the values generated by the sparse matrix, since a numpy array is faster than a sparse matrix (lil_matrix).


Solution

  • A sparse matrix is not an ndarray subclass (unlike np.matrix, which is one), and doesn't necessarily behave like one either (though in many ways it tries to).

    In [129]: arr = np.zeros((3,4),int)
    In [130]: M = sparse.lil_matrix([0,1,2,0])
    In [131]: M.shape
    Out[131]: (1, 4)
    In [132]: arr[0,:] = M
    ...
    ValueError: setting an array element with a sequence.
    

    But if I first convert the sparse matrix to an array or matrix, the assignment works (M.A is shorthand for M.toarray()):

    In [133]: arr[0,:] = M.A
    In [134]: arr[0,:] = M.todense()
    In [135]: arr
    Out[135]: 
    array([[0, 1, 2, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]])
    

    As a general rule, sparse matrices can't be plugged into numpy code. The exception is numpy code that delegates the task to the object's own methods.
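As a standalone script, the conversion step might look like the sketch below. It uses `toarray()` instead of the `.A` shorthand; the array shape and values simply mirror the session above, and the second assignment (flattening with `ravel()`) is an extra illustration, not part of the original answer.

```python
import numpy as np
from scipy import sparse

arr = np.zeros((3, 4), dtype=int)

# A 1x4 sparse row, built from an explicit 2-D array
M = sparse.lil_matrix(np.array([[0, 1, 2, 0]]))

# toarray() returns a plain (1, 4) ndarray; numpy can assign that
# to the row slice, which has shape (4,)
arr[0, :] = M.toarray()

# Flattening to 1-D first gives the same result
arr[1, :] = M.toarray().ravel()

print(arr)
```

Either form hands numpy an ordinary ndarray, so the assignment no longer depends on how the sparse object responds to numpy's conversion machinery.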

    Looks like you are trying to generate something like:

    In [148]: arr = np.zeros((3,5),float)
    In [149]: for i in range(arr.shape[0]):
         ...:     arr[i,:] = sparse.rand(1,5, .2*(i+1)).A
         ...:     
    In [150]: arr
    Out[150]: 
    array([[ 0.        ,  0.        ,  0.82470353,  0.        ,  0.        ],
           [ 0.        ,  0.43339367,  0.99427277,  0.        ,  0.        ],
           [ 0.        ,  0.99843277,  0.05182824,  0.1705916 ,  0.        ]])
    

    A pure sparse equivalent might be:

    In [151]: alist = []
    In [152]: for i in range(3):
         ...:     alist.append(sparse.rand(1,5, .2*(i+1)))
         ...:     
         ...:     
    In [153]: alist
    Out[153]: 
    [<1x5 sparse matrix of type '<class 'numpy.float64'>'
        with 1 stored elements in COOrdinate format>,
     <1x5 sparse matrix of type '<class 'numpy.float64'>'
        with 2 stored elements in COOrdinate format>,
     <1x5 sparse matrix of type '<class 'numpy.float64'>'
        with 3 stored elements in COOrdinate format>]
    In [154]: sparse.vstack(alist)
    Out[154]: 
    <3x5 sparse matrix of type '<class 'numpy.float64'>'
        with 6 stored elements in COOrdinate format>
    In [155]: _.A
    Out[155]: 
    array([[ 0.        ,  0.        ,  0.        ,  0.19028467,  0.        ],
           [ 0.        ,  0.        ,  0.        ,  0.92668274,  0.67424419],
           [ 0.96208905,  0.63604635,  0.        ,  0.69463657,  0.        ]])
    

    But considering that sparse.vstack uses sparse.bmat to join the matrices into a new one, and bmat combines the coo attributes of the components, the dense array accumulation approach could well be faster.
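To compare the two approaches on your own data, a minimal side-by-side sketch could look like this. The sizes, the `densities` list, and the fixed seed are hypothetical stand-ins for the per-neuron probabilities in the question; `sparse.random` is used here as the longer name for `sparse.rand`'s functionality, with a seeded `RandomState` only for reproducibility.

```python
import numpy as np
from scipy import sparse

n_rows, n_cols = 3, 5
densities = [0.2, 0.4, 0.6]        # hypothetical per-row densities
rng = np.random.RandomState(0)     # fixed seed, for reproducibility only

# Approach 1: fill a preallocated dense array row by row
dense = np.zeros((n_rows, n_cols))
for i, d in enumerate(densities):
    row = sparse.random(1, n_cols, density=d, random_state=rng)
    dense[i, :] = row.toarray()

# Approach 2: keep the rows sparse, stack them once, convert at the end
rows = [sparse.random(1, n_cols, density=d, random_state=rng)
        for d in densities]
stacked = sparse.vstack(rows)
stacked_dense = stacked.toarray()

print(dense.shape, stacked.shape, stacked.nnz)
```

Which one is faster will depend on the matrix size and density, so wrapping each approach in a function and timing it with `timeit` on realistic dimensions is probably the most reliable way to decide.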