python, gensim, word2vec

Gensim word2vec in python3 missing vocab


I'm using the gensim implementation of Word2Vec. I have the following code snippet:

print('training model')
model = Word2Vec(Sentences(start, end))
print('trained model:', model)
print('vocab:', model.vocab.keys())

When I run this in Python 2, it works as expected: the final print shows all the words in the vocabulary.

However, if I run it in Python 3, I get an error:

trained model: Word2Vec(vocab=102, size=100, alpha=0.025)
Traceback (most recent call last):
  File "learn.py", line 58, in <module>
    train(to_datetime('-4h'), to_datetime('now'), 'model.out')
  File "learn.py", line 23, in train
    print('vocab:', model.vocab.keys())
AttributeError: 'Word2Vec' object has no attribute 'vocab'

What is going on? Is gensim's word2vec not compatible with Python 3?


Solution

  • Are you using the same version of gensim in both environments? Gensim 1.0.0 moved vocab off the model and onto a helper object, the word vectors exposed as model.wv. So in pre-1.0.0 versions of gensim (in Python 2 or 3), you can use:

    model.vocab

    ...whereas in gensim 1.0.0+ (in Python 2 or 3) you should instead use:

    model.wv.vocab
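
If the same script has to run against whichever gensim version happens to be installed, one option is to look the vocabulary up defensively. The following is only a minimal sketch: the helper name vocab_words is hypothetical, and the key_to_index branch is an assumption beyond the answer above, covering gensim 4.x, where the vocab dict on model.wv was replaced by key_to_index.

```python
def vocab_words(model):
    """Return the vocabulary words of a trained Word2Vec model as a list.

    Hypothetical helper: it only inspects attributes, so it works on any
    object that follows one of the layouts described below.
    """
    # gensim 1.0.0+ keeps the word vectors on model.wv; earlier releases
    # store the vocabulary directly on the model itself.
    holder = getattr(model, "wv", model)

    # Assumption: gensim 4.x, where model.wv.key_to_index maps word -> index.
    key_to_index = getattr(holder, "key_to_index", None)
    if key_to_index is not None:
        return list(key_to_index)

    # gensim 1.0.0 to 3.x (model.wv.vocab) or pre-1.0.0 (model.vocab).
    return list(holder.vocab)
```

In the snippet from the question, the last line would then read print('vocab:', vocab_words(model)). Printing gensim.__version__ in both environments is a quick way to confirm that the Python 2 and Python 3 installs really do differ.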