python, tensorflow, deep-learning, scipy, sparse-matrix

convert 2D sparse matrix to 3D matrix


I want to convert a 2D sparse matrix to a 3D matrix, as I need to give it as input to a Conv1D layer, which expects a 3D tensor.

Here is the input for the Conv1D layer.

from scipy.sparse import hstack
other_features_train = hstack((X_train_state_ohe, X_train_teacher_ohe, X_train_grade_ohe, X_train_category_ohe, X_train_subcategory_ohe,X_train_price_norm,X_train_number_norm))
other_features_cv = hstack((X_cv_state_ohe, X_cv_teacher_ohe, X_cv_grade_ohe,X_cv_category_ohe,X_cv_subcategory_ohe,X_cv_price_norm,X_cv_number_norm))
other_features_test = hstack((X_test_state_ohe, X_test_teacher_ohe, X_test_grade_ohe,X_test_category_ohe,X_test_subcategory_ohe,X_test_price_norm,X_test_number_norm))

print(other_features_train.shape)
print(other_features_cv.shape)
print(other_features_test.shape)

Shape of the train, cv, and test data:

(49041, 101)
(24155, 101)
(36052, 101)

This is my model architecture.

tf.keras.backend.clear_session()

vec_size = 300

input_model_1 = Input(shape=(300,),name='essay')
embedding = Embedding(vocab_size_essay, vec_size, weights=[word_vector_matrix], input_length = max_length, trainable=False)(input_model_1)
lstm = LSTM(16)(embedding)
flatten_1 = Flatten()(lstm)

input_model_2 = Input(shape=(101, ),name='other_features')
conv_layer1 = Conv1D(32, 3, strides=1, padding='valid', kernel_initializer='glorot_uniform', activation='relu')(input_model_2)
conv_layer2 = Conv1D(32, 3, strides=1, padding='valid', kernel_initializer='glorot_uniform', activation='relu')(conv_layer1)
conv_layer3 = Conv1D(32, 3, strides=1, padding='valid', kernel_initializer='glorot_uniform', activation='relu')(conv_layer2)
flatten_2 = Flatten()(conv_layer3)

concat_layer = concatenate(inputs=[flatten_1, flatten_2],name='concat')

dense_layer_1 = Dense(units=32, activation='relu', kernel_initializer='he_normal', name='dense_layer_1')(concat_layer)

dropout_1 = Dropout(0.2)(dense_layer_1)

dense_layer_2 = Dense(units=32, activation='relu', kernel_initializer='he_normal', name='dense_layer_2')(dropout_1)

dropout_2 = Dropout(0.2)(dense_layer_2)

dense_layer_3 = Dense(units=32, activation='relu', kernel_initializer='he_normal', name='dense_layer_3')(dropout_2)

output = Dense(units=2, activation='softmax', kernel_initializer='glorot_uniform', name='output')(dense_layer_3)

model_3 = Model(inputs=[input_model_1,input_model_2],outputs=output)

And I am getting this error when I try to give it a 2D array.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-18-44c8f6f0caa7> in <module>
      9 
     10 input_model_2 = Input(shape=(101, ),name='other_features')
---> 11 conv_layer1 = Conv1D(32, 3, strides=1, padding='valid', kernel_initializer='glorot_uniform', activation='relu')(input_model_2)
     12 conv_layer2 = Conv1D(32, 3, strides=1, padding='valid', kernel_initializer='glorot_uniform', activation='relu')(conv_layer1)
     13 conv_layer3 = Conv1D(32, 3, strides=1, padding='valid', kernel_initializer='glorot_uniform', activation='relu')(conv_layer2)

~\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py in __call__(self, inputs, *args, **kwargs)
    810         # are casted, not before.
    811         input_spec.assert_input_compatibility(self.input_spec, inputs,
--> 812                                               self.name)
    813         graph = backend.get_graph()
    814         with graph.as_default(), backend.name_scope(self._name_scope()):

~\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\keras\engine\input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
    175                          'expected ndim=' + str(spec.ndim) + ', found ndim=' +
    176                          str(ndim) + '. Full shape received: ' +
--> 177                          str(x.shape.as_list()))
    178     if spec.max_ndim is not None:
    179       ndim = x.shape.ndims

ValueError: Input 0 of layer conv1d is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 101]

model_3.summary()

Compile the model

model_3.compile(loss = "binary_crossentropy", optimizer=Adam(), metrics=["accuracy"])

Fit the model

model_3.fit(train_features,y_train_ohe,batch_size=16,epochs=10,validation_data=(cv_features,y_cv_ohe))

train_features = [train_text, other_features_train]
cv_features = [cv_text, other_features_cv]
test_features = [test_text, other_features_test]

Text Features

train_text = X_train['essay'].tolist()
cv_text = X_cv['essay'].tolist()
test_text = X_test['essay'].tolist()

token = Tokenizer()
token.fit_on_texts(train_text)

vocab_size_essay = len(token.word_index) + 1
print("No. of unique words = ", vocab_size_essay)

encoded_train_text = token.texts_to_sequences(train_text)
encoded_cv_text = token.texts_to_sequences(cv_text)
encoded_test_text = token.texts_to_sequences(test_text)

#print(encoded_test_text[:5])

max_length = 300

train_text = pad_sequences(encoded_train_text, maxlen=max_length, padding='post')
cv_text = pad_sequences(encoded_cv_text, maxlen=max_length, padding='post')
test_text = pad_sequences(encoded_test_text, maxlen=max_length, padding='post')

print("\n")
print(train_text.shape)
print(cv_text.shape)
print(test_text.shape)

Shape of the text features:

No. of unique words =  41468


(49041, 300)
(24155, 300)
(36052, 300)

So, I want to reshape them to

(49041,101,1) 
(24155,101,1) 
(36052,101,1) 

Please suggest how to do it.


Solution

    The solution here requires clarity on a few concepts, which I explain in the following sections:

    • what keras expects as inputs
    • what kind of modifications could be done to your keras model to allow sparse input matrices
    • converting a 2D numpy array to a 3D numpy array
    • back-and-forth conversion between a sparse and a non-sparse (or dense) array using
      • scipy.sparse.coo_matrix for a 2D numpy array
      • sparse.COO for a 3D numpy array

    Using sparse matrices as input to tf.keras models

    • One option is to convert your sparse input matrices into the non-sparse (dense) format using the todense() method. This makes each matrix a regular numpy array. See the kaggle discussions [3] and [4], and the sketch after this list.

    • Another option is to write your own custom layers for both sparse and dense inputs by subclassing the tf.keras.layers.Layer class. See the article [2].

    • It appears that tensorflow.keras now allows model training with sparse weights, so it has some ability to handle sparsity. You may want to explore the documentation [1] for this aspect.
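
    Here is a minimal sketch of the first option, applied to the matrices from the question; the helper name to_conv1d_input and the *_3d variable names are purely illustrative.

    import numpy as np
    
    # Densify each scipy sparse matrix and add a trailing channel axis,
    # so every sample has shape (101, 1), which is what Conv1D expects.
    def to_conv1d_input(sparse_mat):
        dense = sparse_mat.toarray()      # (n_samples, 101) regular numpy array
        return dense[:, :, np.newaxis]    # (n_samples, 101, 1)
    
    other_features_train_3d = to_conv1d_input(other_features_train)
    other_features_cv_3d = to_conv1d_input(other_features_cv)
    other_features_test_3d = to_conv1d_input(other_features_test)
    
    print(other_features_train_3d.shape)  # (49041, 101, 1)
    print(other_features_cv_3d.shape)     # (24155, 101, 1)
    print(other_features_test_3d.shape)   # (36052, 101, 1)
    

    Note that input_model_2 would then need Input(shape=(101, 1)) so that the Conv1D layers receive 3D batches of shape (None, 101, 1).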

    Adding a new axis to a numpy array

    You can add another axis to a numpy array using np.newaxis as follows.

    import numpy as np
    
    ## Make a 2D array
    a2D = np.zeros((10,10))
    
    # Make a few elements non-zero in a2D
    aa = a2D.flatten()
    aa[[0,13,41,87,98]] = np.random.randint(1,10,size=5)
    a2D = aa.reshape(a2D.shape)
    
    # Make 3D array from 2D array by adding another axis
    a3D = a2D[:,:,np.newaxis]
    #print(a2D)
    print('a2D.shape: {}\na3D.shape: {}'.format(a2D.shape, a3D.shape))
    

    Output:

    a2D.shape: (10, 10)
    a3D.shape: (10, 10, 1)
    

    Having said that, please take a look at the links in the References section.

    Sparse Arrays

    Since a sparse array has very few non-zero values, a regular numpy array, when converted into a sparse array, is stored in one of a few sparse formats (a small coo_matrix example follows this list):

    • csr_matrix: row-wise arrays of non-zero values and indices
    • csc_matrix: column-wise arrays of non-zero values and indices
    • coo_matrix: a table with three columns
      • row
      • column
      • non-zero value
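
    A tiny example of the "table with three columns" idea behind the COO format; the 2x3 matrix here is made up purely for illustration.

    from scipy.sparse import coo_matrix
    import numpy as np
    
    # A small 2D array with only two non-zero entries
    m = np.array([[0, 0, 3],
                  [4, 0, 0]])
    
    coo = coo_matrix(m)
    # coo stores only the non-zero entries as three parallel arrays
    print(coo.row)   # [0 1] -> row index of each non-zero value
    print(coo.col)   # [2 0] -> column index of each non-zero value
    print(coo.data)  # [3 4] -> the non-zero values themselves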

    Scipy sparse matrices expect a 2D input matrix

    However, the scipy.sparse implementation of the above three sparse-matrix types only accepts a 2D non-sparse matrix as input.

    from scipy.sparse import csr_matrix, coo_matrix
    
    coo_a2D = coo_matrix(a2D)
    coo_a2D.shape # output: (10, 10)
    
    # scipy.sparse only accepts 2D input matrices,
    # so the following line will throw an !!! ERROR !!!
    coo_a3D = coo_matrix(coo_a2D.toarray()[:, :, np.newaxis])
    

    Sparse Matrix from 3D non-sparse input matrix

    Yes, you can do this using the sparse library, which also works with scipy.sparse matrices and numpy arrays. To convert from a sparse array back to the non-sparse (dense) format (this is NOT a Dense layer in neural networks), use the todense() method.

    ## Installation
    # pip install -U sparse
    
    import sparse
    
    ## Create sparse coo_matrix from a
    # 3D numpy array (dense format)
    coo_a3D = sparse.COO(a3D)
    
    ## Test that coo_a3D equals the COO array
    #  made from (coo_a2D + new axis)
    print(
        (coo_a3D == sparse.COO(coo_a2D.toarray()[:, :, np.newaxis])).all()
    ) # output: True
    ## Convert to dense (non-sparse) format
    #   use: coo_a3D.todense()
    print((a3D == coo_a3D.todense()).all()) # output: True
    

    scipy.sparse.coo_matrix vs. sparse.COO

    In short: scipy.sparse.coo_matrix only supports 2D matrices, whereas sparse.COO supports N-dimensional sparse arrays.

    PyTorch: torch.sparse 🔥 ⭐

    The PyTorch library also provides ways to work with sparse tensors.
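
    For instance, here is a minimal sketch of building a sparse COO tensor in PyTorch; the 2x3 example values are made up.

    import torch
    
    # indices has shape (ndim, nnz); values holds the matching non-zero entries
    indices = torch.tensor([[0, 1],    # row indices
                            [2, 0]])   # column indices
    values = torch.tensor([3.0, 4.0])
    sp = torch.sparse_coo_tensor(indices, values, size=(2, 3))
    
    print(sp.to_dense())
    # tensor([[0., 0., 3.],
    #         [4., 0., 0.]])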

    References

    1. Train sparse TensorFlow models with Keras

    2. How to design deep learning models with sparse inputs in Tensorflow Keras

    3. Neural network for sparse matrices

    4. Training Neural network with scipy sparse matrix?

    5. Documentation of sparse library