Tags: neural-network, nlp, keras, keras-layer

Issue with TimeDistributed LSTMs


I'm trying to implement the Dependency Sensitive Convolutional Neural Network (DSCNN) for modeling documents by Rui Zhang in Keras. This is the first network I have implemented in Keras, so I have a few questions.

The network looks as follows:

[Figure: DSCNN architecture]

I think the implementation is mostly complete, but model initialization fails. I've put the code in a gist: https://gist.github.com/pexmar/cec8dfdfe46b24ea7d1765f398df8d9d

The error that occurs is the following:

Traceback (most recent call last):
  File "/Users/peter/Masterarbeit/python-projects/dscnn-keras/classify.py", line 64, in <module>
    model = create_model(embeddings, max_sentences_per_doc, max_sentence_len, kernel_size=[3, 4, 5], filters=100)
  File "/Users/peter/Masterarbeit/python-projects/dscnn-keras/model.py", line 38, in create_model
    sentence_modeling = [shared_sentence_lstm(sentence_modeling[i]) for i in range(max_sentences_per_doc)]
  File "/Users/peter/Masterarbeit/python-projects/dscnn-keras/model.py", line 38, in <listcomp>
    sentence_modeling = [shared_sentence_lstm(sentence_modeling[i]) for i in range(max_sentences_per_doc)]
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/keras/engine/topology.py", line 528, in __call__
    self.build(input_shapes[0])
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/keras/layers/wrappers.py", line 104, in build
    self.layer.build(child_input_shape)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/keras/layers/recurrent.py", line 959, in build
    self.input_dim = input_shape[2]
IndexError: tuple index out of range

Do you know where my mistake is?

Do you see other mistakes? If I comment out the erroneous line, I get the following error:

Traceback (most recent call last):
  File "/Users/peter/Masterarbeit/python-projects/dscnn-keras/classify.py", line 64, in <module>
    model = create_model(embeddings, max_sentences_per_doc, max_sentence_len, kernel_size=[3, 4, 5], filters=100)
  File "/Users/peter/Masterarbeit/python-projects/dscnn-keras/model.py", line 43, in create_model
    sentence_modeling = [shared_sentence_lstm_2(sentence_modeling[i]) for i in range(max_sentences_per_doc)]
  File "/Users/peter/Masterarbeit/python-projects/dscnn-keras/model.py", line 43, in <listcomp>
    sentence_modeling = [shared_sentence_lstm_2(sentence_modeling[i]) for i in range(max_sentences_per_doc)]
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/keras/layers/recurrent.py", line 252, in __call__
    return super(Recurrent, self).__call__(inputs, **kwargs)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/keras/engine/topology.py", line 554, in __call__
    output = self.call(inputs, **kwargs)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/keras/layers/recurrent.py", line 290, in call
    preprocessed_input = self.preprocess_input(inputs, training=None)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/keras/layers/recurrent.py", line 1033, in preprocess_input
    return K.concatenate([x_i, x_f, x_c, x_o], axis=2)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 1527, in concatenate
    return tf.concat([to_dense(x) for x in tensors], axis)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py", line 1075, in concat
    dtype=dtypes.int32).get_shape(
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 669, in convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py", line 176, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py", line 165, in constant
    tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py", line 367, in make_tensor_proto
    _AssertCompatible(values, dtype)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py", line 302, in _AssertCompatible
    (dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.

Would this second error still occur once the first one is fixed? What is the mistake here?

Thank you in advance for your answer!


Solution

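  • I found the cause of the first error: a TimeDistributed layer cannot be applied at that position. TimeDistributed removes the time axis before building the layer it wraps, so the wrapped LSTM was built with a shape that had too few dimensions, which is why input_shape[2] raised the IndexError. I replaced it with a plain LSTM (which also matches the paper better), and then it worked.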
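
To make the shape argument concrete, here is a minimal sketch in plain Python that mimics only the two shape operations visible in the first traceback (the wrapper building its child with a reduced shape, and the LSTM reading input_shape[2]). The helper names and the example sizes are hypothetical and not taken from the gist or from Keras itself, and the sketch assumes each sentence tensor is 3D (batch, words, embedding), which seems likely given the per-sentence loop in model.py.

```python
# Simplified model of the shape handling behind the first traceback.
# These helpers are hypothetical stand-ins, not the Keras implementation.

def time_distributed_child_shape(input_shape):
    """Shape the wrapped layer is built with: the time axis (axis 1) is removed."""
    return (input_shape[0],) + tuple(input_shape[2:])


def lstm_input_dim(input_shape):
    """An LSTM reads its feature size from axis 2, so it needs a 3D shape."""
    return input_shape[2]


# Assumed example sizes (not taken from the gist).
max_sentences_per_doc, max_sentence_len, embedding_dim = 10, 20, 50

# One sentence at a time: (batch, words, embedding).
sentence_shape = (None, max_sentence_len, embedding_dim)
# A whole document: (batch, sentences, words, embedding).
document_shape = (None, max_sentences_per_doc, max_sentence_len, embedding_dim)

# A plain LSTM on a single sentence finds its feature dimension.
plain_lstm_dim = lstm_input_dim(sentence_shape)

# TimeDistributed on a single sentence hands the LSTM a 2D shape,
# so reading axis 2 fails just like in the traceback.
wrapped_sentence_shape = time_distributed_child_shape(sentence_shape)
try:
    lstm_input_dim(wrapped_sentence_shape)
    wrapped_error = None
except IndexError as exc:
    wrapped_error = str(exc)

# The wrapper only fits when the input carries the extra sentence axis.
wrapped_document_shape = time_distributed_child_shape(document_shape)
wrapped_document_dim = lstm_input_dim(wrapped_document_shape)

print(plain_lstm_dim, wrapped_sentence_shape, wrapped_error, wrapped_document_dim)
```

Under these assumptions there are two consistent designs: loop over the sentences and apply a shared plain LSTM to each 3D sentence tensor (what worked here), or keep one 4D document tensor and wrap the LSTM in TimeDistributed. Mixing the two, a per-sentence loop plus TimeDistributed, removes one axis too many.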