python-3.x gensim word2vec

How to turn a list of words into a list of vectors using Google's pre-trained word2vec model?


I am trying to learn word2vec.

I am using the code below to load the Google pre-trained word2vec model in Python 3, but I am unsure how to turn a list such as ["I", "ate", "apple"] into a list of vectors (i.e., how do I get vectors from this model?).

import nltk
import gensim

# Load Google's pre-trained Word2Vec model.
model = gensim.models.KeyedVectors.load_word2vec_format('./model/GoogleNews-vectors-negative300.bin', binary=True)

Solution

  • You get the vector via idiomatic Python keyed-index-access (brackets). For example:

        wv_apple = model['apple']
    

    You can create a new list by applying some operation to every item of an existing list with an idiomatic Python 'list comprehension' ([expression(x) for x in some_list]). For example:

        words = ["I", "ate", "apple"]
        vectors = [model[word] for word in words]
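
One caveat: the list comprehension above raises a KeyError if any word is missing from the model's vocabulary. A minimal sketch of one way to skip such words follows. The helper name words_to_vectors and the small toy_model dict are hypothetical and used only so the example runs without the GoogleNews file; the loaded gensim model supports the same `word in model` and `model[word]` operations.

```python
def words_to_vectors(model, words):
    """Return (vectors, missing) for the given words.

    `model` can be anything that supports `word in model` and
    `model[word]`, as gensim's KeyedVectors does.
    """
    vectors = []
    missing = []
    for word in words:
        if word in model:
            vectors.append(model[word])
        else:
            # Collect unknown words instead of raising KeyError.
            missing.append(word)
    return vectors, missing


# Hypothetical stand-in for the loaded model; the real GoogleNews
# vectors are 300-dimensional NumPy arrays, not 3-element lists.
toy_model = {
    "I": [0.1, 0.2, 0.3],
    "ate": [0.4, 0.5, 0.6],
    "apple": [0.7, 0.8, 0.9],
}

vectors, missing = words_to_vectors(toy_model, ["I", "ate", "apple", "qwertyuiop"])
print(len(vectors), missing)
```

With the real model, you would pass `model` in place of `toy_model`. Whether to drop unknown words, as here, or to substitute a placeholder vector depends on what you do with the vectors afterwards.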