I want to concatenate character embeddings (generated using a CNN) with my word embeddings (GloVe vectors), but I am getting an error because the shape of the char embeddings differs from that of the word embeddings.
How can I fix the error, or how can I concatenate these?
    self.character_embedding_weights = tf.get_variable(
        "character_embedding_weights",
        shape=[dataset.alphabet_size, parameters['character_embedding_dimension']],
        initializer=initializer)
    embedded_characters = tf.nn.embedding_lookup(self.character_embedding_weights,
                                                 self.input_token_character_indices,
                                                 name='embedded_characters')
    if self.verbose:
        print("embedded_characters: {0}".format(embedded_characters))
    utils_tf.variable_summaries(self.character_embedding_weights)
    s = tf.shape(embedded_characters)
    char_embeddings = tf.reshape(embedded_characters, shape=[-1, 25, 20])
    # Conv #1
    conv1 = tf.layers.conv1d(
        inputs=char_embeddings,
        filters=30,
        kernel_size=3,
        padding="valid",
        activation=tf.nn.relu)
    # Conv #2
    conv2 = tf.layers.conv1d(
        inputs=conv1,
        filters=30,
        kernel_size=3,
        padding="valid",
        activation=tf.nn.relu)
    pool2 = tf.layers.max_pooling1d(inputs=conv2, pool_size=2, strides=2)
    # Dense layer
    character_embed_output = tf.layers.dense(inputs=pool2, units=32, activation=tf.nn.relu)
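For reference, the sequence-length arithmetic of this CNN stack explains the `[?,10,32]` shape in the error below. Assuming 25 characters per token with 20-dimensional char embeddings (the values used in the reshape to `[-1, 25, 20]`), a plain-Python sketch of the `"valid"` convolution and pooling length formulas:

```python
def conv1d_valid_len(length, kernel_size):
    # 'valid' padding, stride 1: output length shrinks by kernel_size - 1
    return length - kernel_size + 1

def max_pool1d_len(length, pool_size, strides):
    # 'valid' pooling: floor((length - pool_size) / strides) + 1
    return (length - pool_size) // strides + 1

length = 25                            # chars per token after the reshape
length = conv1d_valid_len(length, 3)   # conv1: 25 -> 23
length = conv1d_valid_len(length, 3)   # conv2: 23 -> 21
length = max_pool1d_len(length, 2, 2)  # pool2: 21 -> 10

print(length)  # 10, so the dense layer's output shape is [?, 10, 32]
```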
Here, I'm concatenating the token and char embeddings.
    with tf.variable_scope("concatenate_token_and_character_vectors"):
        if self.verbose:
            print('embedded_tokens: {0}'.format(embedded_tokens))
        token_lstm_input = tf.concat([character_embed_output, embedded_tokens],
                                     axis=1, name='token_lstm_input')
I'm getting this error:

    ValueError: Shape must be rank 3 but is rank 2 for 'concatenate_token_and_character_vectors/token_lstm_input' (op: 'ConcatV2') with input shapes: [?,10,32], [?,100], [].
I'm working with this repo: https://github.com/Franck-Dernoncourt/NeuroNER. It uses an LSTM for the char-level embedding, and I want to use a CNN instead; my CNN code is shown above.
Comment if any other info or code is required.
Finally, I was able to resolve the problem by flattening the char embedding output, so it can easily be concatenated with the word embeddings. Adding this line made it work:

    character_embed_output = tf.layers.Flatten()(character_embed_output)
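To see why flattening fixes the error, here is a minimal NumPy sketch (NumPy stands in for TensorFlow here, and the batch size of 4 is just illustrative). The char branch produces a rank-3 `[batch, 10, 32]` tensor while the word embeddings are rank-2 `[batch, 100]`; `tf.concat` requires both inputs to have the same rank, so flattening the char output to `[batch, 320]` makes the concatenation along axis 1 work:

```python
import numpy as np

batch = 4
char_out = np.zeros((batch, 10, 32))   # rank-3 CNN output, like [?, 10, 32]
word_emb = np.zeros((batch, 100))      # rank-2 word embeddings, like [?, 100]

# Concatenating rank 3 with rank 2 along axis 1 fails, mirroring the ValueError:
try:
    np.concatenate([char_out, word_emb], axis=1)
except ValueError as e:
    print("concat failed:", e)

# Flattening the char output (what tf.layers.Flatten() does) makes the ranks match:
char_flat = char_out.reshape(batch, -1)   # shape (4, 320)
token_lstm_input = np.concatenate([char_flat, word_emb], axis=1)
print(token_lstm_input.shape)  # (4, 420)
```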