I get a ValueError: could not broadcast input array from shape (50) into shape (100)
while preparing the embedding matrix. I have loaded GloVe and built the word-to-vector dictionary; it prints "Found 400000 word vectors."
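For reference, the loading step looks roughly like this (the standard GloVe loading loop; the file name below is a placeholder for whichever glove.6B file I downloaded):

import numpy as np

# standard GloVe loading loop; the file name is a placeholder for
# whichever glove.6B.*d.txt file was actually downloaded
word2vec = {}
with open('glove.6B.100d.txt', encoding='utf-8') as f:
    for line in f:
        values = line.split()
        word = values[0]
        vec = np.asarray(values[1:], dtype='float32')
        word2vec[word] = vec
print('Found %s word vectors.' % len(word2vec))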
I did look at a bunch of similar questions, but they all seem to deal with forgetting to add the +1 to the max number of words. I think I have that covered, but I still have the issue. Any help is deeply appreciated.
num_words = min(MAX_NUM_WORDS, len(word2idx_inputs) + 1)
I also tried
num_words = min(MAX_NUM_WORDS, len(word2idx_inputs)) + 1
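As I understand it, the +1 is needed because the Keras Tokenizer starts word indices at 1 and reserves 0 for padding, so the matrix needs one extra row. A toy illustration (not my real data):

# toy vocabulary: Tokenizer-style indices start at 1, 0 is padding,
# so len(vocab) words need len(vocab) + 1 matrix rows
word2idx_toy = {'the': 1, 'cat': 2, 'sat': 3}
rows_needed = len(word2idx_toy) + 1  # row 0 stays all-zero for padding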
I also looked at this question:
Using pre-trained word embeddings in a keras model?
and tried this one as well:
Keras word embeddings Glove: can't prepare the embedding matrix
but that one also came down to the +1 issue.
FYI: I'm an extreme newbie at this. It's my first time doing seq2seq; I'm translating Tagalog into English.
The error that I receive:
Filling pre-trained embeddings...
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-acf0d8a4c4ca> in <module>
      8         if embedding_vector is not None:
      9             # words not found in embedding index will be all zeros.
---> 10             embedding_matrix[i] = embedding_vector
     11
     12 # create embedding layer

ValueError: could not broadcast input array from shape (50) into shape (100)
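For what it's worth, the same broadcast error can be reproduced with a minimal numpy snippet, so it seems to be a row-length mismatch rather than an indexing problem:

import numpy as np

m = np.zeros((5, 100))  # rows of length 100
m[0] = np.zeros(50)     # assigning a length-50 vector raises the
                        # same "could not broadcast" ValueError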
Code
# prepare embedding matrix
print('Filling pre-trained embeddings...')
num_words = min(MAX_NUM_WORDS, len(word2idx_inputs) + 1)
embedding_matrix = np.zeros((num_words, EMBEDDING_DIM))
for word, i in word2idx_inputs.items():
    if i < MAX_NUM_WORDS:
        embedding_vector = word2vec.get(word)
        if embedding_vector is not None:
            # words not found in embedding index will be all zeros.
            embedding_matrix[i] = embedding_vector

# create embedding layer
embedding_layer = Embedding(
    num_words,
    EMBEDDING_DIM,
    weights=[embedding_matrix],
    input_length=max_len_input,
    # trainable=True
)
# create targets, since we cannot use sparse
# categorical cross entropy when we have sequences
decoder_targets_one_hot = np.zeros(
    (
        len(input_texts),
        max_len_target,
        num_words_output
    ),
    dtype='float32'
)

# assign the values
for i, d in enumerate(decoder_targets):
    for t, word in enumerate(d):
        if word != 0:
            decoder_targets_one_hot[i, t, word] = 1
Check the EMBEDDING_DIM value. The pre-trained vectors you loaded probably have a smaller dimension than you configured: the error shows a shape-(50) array being assigned into a shape-(100) row, which means the GloVe file you read contains 50-dimensional vectors. So either set EMBEDDING_DIM = 50, or load the 100-dimensional GloVe file instead.
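One way to avoid hardcoding the wrong size is to derive the dimension from the vectors that were actually loaded (a small sketch, assuming word2vec is the dict built while reading the GloVe file):

# take the dimension from the first loaded vector so the embedding
# matrix always matches whichever GloVe file was read
EMBEDDING_DIM = len(next(iter(word2vec.values())))
embedding_matrix = np.zeros((num_words, EMBEDDING_DIM))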