Tags: python, machine-learning, keras, gensim, word2vec

Error while implementing Word2Vec model with embedding_vector


I'm getting an AttributeError while trying to build an embedding matrix from a loaded Word2Vec model:

import numpy as np
from gensim.models import KeyedVectors

embeddings_dictionary = KeyedVectors.load_word2vec_format('model', binary=True)

embedding_matrix = np.zeros((vocab_size, 100))
for word, index in tokenizer.word_index.items():
    embedding_vector = embeddings_dictionary.get(word)
    if embedding_vector is not None:
        embedding_matrix[index] = embedding_vector

AttributeError: 'Word2VecKeyedVectors' object has no attribute 'get'


Solution

  • gensim's KeyedVectors abstraction does not offer a get() method – hence the AttributeError. (What docs or example are you following that suggest it does?)

    You can use standard Python []-indexing instead, e.g.:

    embeddings_dictionary[word]
    
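    Note that, unlike dict.get(), bracket indexing raises a KeyError for words missing from the model's vocabulary, so test membership with the in operator first. A minimal sketch of the corrected loop, assuming the tokenizer and 100-dimensional vectors from your question:

    import numpy as np

    # +1 reserves row 0, which Keras' Tokenizer never assigns to a word
    vocab_size = len(tokenizer.word_index) + 1
    embedding_matrix = np.zeros((vocab_size, 100))

    for word, index in tokenizer.word_index.items():
        # the in operator consults the model's vocabulary, avoiding the KeyError
        if word in embeddings_dictionary:
            embedding_matrix[index] = embeddings_dictionary[word]
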

    That said, there isn't really a reason for your loop to copy each vector into your own embedding_matrix. The KeyedVectors instance already holds a raw array, with each vector in a row, in the order of the KeyedVectors .index2entity list – in its .vectors property:

    embeddings_dictionary.vectors
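
    Keep in mind that the rows of .vectors follow gensim's own .index2entity order, not your tokenizer's word_index, so use gensim's vocabulary to map words to rows if you use the array directly. A small sketch of that correspondence, assuming the model loaded above ('king' is just a hypothetical example word):

    import numpy as np

    weights = embeddings_dictionary.vectors  # shape: (number of words, 100)

    # row i of .vectors is the vector for .index2entity[i];
    # the .vocab dict gives each word's row index directly
    row = embeddings_dictionary.vocab['king'].index
    assert np.allclose(weights[row], embeddings_dictionary['king'])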