I'm currently trying to implement the Dependency Sensitive Convolutional Neural Network for Modeling Documents by Rui Zhang in Keras. This is the first network I have implemented in Keras, so I have a few questions.
The network looks as follows:
I think the implementation is already fairly far along, but there is a big issue with the model initialization. I've created a gist: https://gist.github.com/pexmar/cec8dfdfe46b24ea7d1765f398df8d9d
The error that occurs is the following:
Traceback (most recent call last):
  File "/Users/peter/Masterarbeit/python-projects/dscnn-keras/classify.py", line 64, in <module>
    model = create_model(embeddings, max_sentences_per_doc, max_sentence_len, kernel_size=[3, 4, 5], filters=100)
  File "/Users/peter/Masterarbeit/python-projects/dscnn-keras/model.py", line 38, in create_model
    sentence_modeling = [shared_sentence_lstm(sentence_modeling[i]) for i in range(max_sentences_per_doc)]
  File "/Users/peter/Masterarbeit/python-projects/dscnn-keras/model.py", line 38, in <listcomp>
    sentence_modeling = [shared_sentence_lstm(sentence_modeling[i]) for i in range(max_sentences_per_doc)]
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/keras/engine/topology.py", line 528, in __call__
    self.build(input_shapes[0])
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/keras/layers/wrappers.py", line 104, in build
    self.layer.build(child_input_shape)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/keras/layers/recurrent.py", line 959, in build
    self.input_dim = input_shape[2]
IndexError: tuple index out of range
Do you know where my mistake is?
Do you see other mistakes? If I comment out the erroneous line, I get the following error:
Traceback (most recent call last):
  File "/Users/peter/Masterarbeit/python-projects/dscnn-keras/classify.py", line 64, in <module>
    model = create_model(embeddings, max_sentences_per_doc, max_sentence_len, kernel_size=[3, 4, 5], filters=100)
  File "/Users/peter/Masterarbeit/python-projects/dscnn-keras/model.py", line 43, in create_model
    sentence_modeling = [shared_sentence_lstm_2(sentence_modeling[i]) for i in range(max_sentences_per_doc)]
  File "/Users/peter/Masterarbeit/python-projects/dscnn-keras/model.py", line 43, in <listcomp>
    sentence_modeling = [shared_sentence_lstm_2(sentence_modeling[i]) for i in range(max_sentences_per_doc)]
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/keras/layers/recurrent.py", line 252, in __call__
    return super(Recurrent, self).__call__(inputs, **kwargs)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/keras/engine/topology.py", line 554, in __call__
    output = self.call(inputs, **kwargs)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/keras/layers/recurrent.py", line 290, in call
    preprocessed_input = self.preprocess_input(inputs, training=None)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/keras/layers/recurrent.py", line 1033, in preprocess_input
    return K.concatenate([x_i, x_f, x_c, x_o], axis=2)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 1527, in concatenate
    return tf.concat([to_dense(x) for x in tensors], axis)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py", line 1075, in concat
    dtype=dtypes.int32).get_shape(
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 669, in convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py", line 176, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py", line 165, in constant
    tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py", line 367, in make_tensor_proto
    _AssertCompatible(values, dtype)
  File "/Users/peter/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py", line 302, in _AssertCompatible
    (dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.
Will this second error still occur once the first one is fixed? What is the mistake here?
Thank you in advance for your answer!
I found the reason for the first error (the IndexError): you cannot apply a TimeDistributed layer at that position. I had to replace it with a plain LSTM, which also makes more sense considering the paper. After that, the model built without errors.
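To illustrate why the wrapper fails there, here is a minimal sketch in plain Python (no Keras required) of the shape bookkeeping visible in the first traceback. It rests on two assumptions: that TimeDistributed builds its wrapped layer with the input shape minus the time axis, and that each per-sentence tensor is 3D (batch, words, embedding). The helper names and the sizes are hypothetical and only mimic the two lines shown in the traceback; they are not Keras functions.

```python
def time_distributed_child_shape(input_shape):
    """Shape a TimeDistributed-style wrapper passes to the wrapped layer.

    Assumption: the wrapper drops axis 1 (the time axis) and keeps the rest.
    """
    return (input_shape[0],) + tuple(input_shape[2:])


def lstm_input_dim(input_shape):
    """Mimics `self.input_dim = input_shape[2]` from recurrent.py in the traceback."""
    return input_shape[2]


# Hypothetical sizes, chosen only for illustration.
max_sentences_per_doc, max_sentence_len, embedding_dim = 30, 50, 300

# One embedded sentence: (batch, words, embedding).
sentence_shape = (None, max_sentence_len, embedding_dim)

# A plain LSTM reads the feature size directly from the 3D shape.
direct_dim = lstm_input_dim(sentence_shape)

# Wrapping the LSTM strips the word axis first, leaving a 2-tuple,
# so index 2 no longer exists.
child_shape = time_distributed_child_shape(sentence_shape)
try:
    lstm_input_dim(child_shape)
    wrapped_error = None
except IndexError as exc:
    wrapped_error = str(exc)

# With a 4D document tensor (batch, sentences, words, embedding),
# the wrapped LSTM would receive a valid 3D shape.
doc_shape = (None, max_sentences_per_doc, max_sentence_len, embedding_dim)
doc_child_dim = lstm_input_dim(time_distributed_child_shape(doc_shape))

print(direct_dim, child_shape, wrapped_error, doc_child_dim)
```

Under these assumptions, the wrapper only makes sense when the input carries an extra axis in front of the word axis (for example a whole document at once). Since the list comprehension already feeds one sentence at a time, the plain shared LSTM is the fitting choice.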