Use coo_matrix in TensorFlow


I'm doing matrix factorization in TensorFlow, and I want to use coo_matrix from scipy.sparse because it uses less memory and makes it easy to put all my data into a matrix for training.

Is it possible to use a coo_matrix to initialize a variable in TensorFlow?

Or do I have to create a session and feed the data into TensorFlow using sess.run() with feed_dict?

I hope my question and my problem are clear; if not, leave a comment and I will try to fix it.


Solution

  • The closest thing TensorFlow has to scipy.sparse.coo_matrix is tf.SparseTensor, which is the sparse equivalent of tf.Tensor. The easiest approach is probably to feed the contents of your coo_matrix into the program at run time.

    A tf.SparseTensor is a slight generalization of COO matrices, where the tensor is represented as three dense tf.Tensor objects:

    • indices: An N x D matrix of tf.int64 values in which each row represents the coordinates of a non-zero value. N is the number of non-zeroes, and D is the rank of the equivalent dense tensor (2 in the case of a matrix).
    • values: A length-N vector of values, where element i is the value of the element whose coordinates are given on row i of indices.
    • dense_shape: A length-D vector of tf.int64, representing the shape of the equivalent dense tensor.
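
    To make the three-component layout concrete, here is a minimal, dependency-free sketch. Plain Python lists stand in for the three dense tensors, and the helper names dense_to_coo_components and coo_components_to_dense are hypothetical, chosen only for this illustration; they are not part of TensorFlow or SciPy.

```python
# Illustrative only: plain Python lists stand in for the three dense tensors.
def dense_to_coo_components(dense):
    """Return (indices, values, dense_shape) for a rectangular list of lists."""
    indices = []  # one [row, col] pair per non-zero entry, i.e. N x 2
    values = []   # the matching non-zero values, length N
    for r, row in enumerate(dense):
        for c, v in enumerate(row):
            if v != 0:
                indices.append([r, c])
                values.append(v)
    dense_shape = [len(dense), len(dense[0]) if dense else 0]
    return indices, values, dense_shape


def coo_components_to_dense(indices, values, dense_shape):
    """Rebuild the dense matrix from the three components."""
    n_rows, n_cols = dense_shape
    dense = [[0] * n_cols for _ in range(n_rows)]
    for (r, c), v in zip(indices, values):
        dense[r][c] = v
    return dense


dense = [
    [0, 0, 5.0, 0],
    [0, 0, 0, 0],
    [1.5, 0, 0, 2.0],
]
indices, values, dense_shape = dense_to_coo_components(dense)
print(indices)      # [[0, 2], [2, 0], [2, 3]]
print(values)       # [5.0, 1.5, 2.0]
print(dense_shape)  # [3, 4]
```

    Here N is 3 (three non-zero entries) and D is 2, so indices is a 3 x 2 list of coordinates, and only those three values need to be stored.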

    For example, the following code uses tf.sparse_placeholder() to define a tf.SparseTensor that you can feed, and a tf.SparseTensorValue to represent the actual value being fed:

    import numpy as np
    import scipy.sparse
    import tensorflow as tf

    sparse_input = tf.sparse_placeholder(dtype=tf.float32, shape=[100, 100])
    # ...
    train_op = ...
    
    coo_matrix = scipy.sparse.coo_matrix(...)
    
    # Wrap `coo_matrix` in the `tf.SparseTensorValue` form that TensorFlow expects.
    # SciPy stores the row and column coordinates as separate vectors, so we must 
    # stack and transpose them to make an indices matrix of the appropriate shape.
    tf_coo_matrix = tf.SparseTensorValue(
        indices=np.array([coo_matrix.row, coo_matrix.col]).T,
        values=coo_matrix.data,
        dense_shape=coo_matrix.shape)
    

    Once you have converted your coo_matrix to a tf.SparseTensorValue, you can feed sparse_input with the tf.SparseTensorValue directly:

    sess.run(train_op, feed_dict={sparse_input: tf_coo_matrix})
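
    One detail that may be worth checking before feeding: by convention tf.SparseTensor expects its indices in row-major order, and most sparse ops assume that ordering, whereas a coo_matrix does not guarantee any particular order for its entries. The sketch below shows one way to sort the COO triplets first. It uses plain Python sequences so that it runs without SciPy, and the helper name sorted_coo_components is hypothetical. Inside a graph, tf.sparse_reorder (tf.sparse.reorder in newer releases) serves the same purpose.

```python
def sorted_coo_components(rows, cols, data, shape):
    """Return (indices, values, dense_shape) with indices in row-major order.

    `rows`, `cols` and `data` are equal-length sequences, like the
    `row`, `col` and `data` attributes of a scipy.sparse.coo_matrix.
    """
    if not (len(rows) == len(cols) == len(data)):
        raise ValueError("rows, cols and data must have the same length")
    # Sort by row first, then by column within each row.
    triplets = sorted(zip(rows, cols, data), key=lambda t: (t[0], t[1]))
    indices = [[int(r), int(c)] for r, c, _ in triplets]
    values = [v for _, _, v in triplets]
    return indices, values, list(shape)


# Entries deliberately given out of order, as a coo_matrix may store them.
rows = [2, 0, 2]
cols = [3, 2, 0]
data = [2.0, 5.0, 1.5]
indices, values, dense_shape = sorted_coo_components(rows, cols, data, (3, 4))
print(indices)  # [[0, 2], [2, 0], [2, 3]]
print(values)   # [5.0, 1.5, 2.0]
```

    The sketch assumes each coordinate appears only once. A coo_matrix may hold duplicate entries, which you can merge beforehand with its sum_duplicates() method.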