machine-learning nlp word2vec gensim doc2vec

Gensim Doc2Vec.infer_vector() equivalent in KeyedVectors


I have a working app that uses Doc2Vec from gensim. I understand that KeyedVectors is now the recommended approach and I am trying to port my code over, but I am not sure what the equivalent of Doc2Vec's infer_vector method is.

Or, better put: how do I obtain a vector for an entire document from a KeyedVectors model, so that I can write it to my Annoy index?


Solution

  • KeyedVectors doesn't replace Doc2Vec; it is a storage and lookup structure for word vectors. Its documentation describes it as:

    Word vector storage and similarity look-ups. Common code independent of the way the vectors are trained (Word2Vec, FastText, WordRank, VarEmbed etc.)

    The word vectors are considered read-only in this class.

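    This class knows nothing about tagged documents, so it cannot implement infer_vector or an equivalent. Inferring a document vector is itself a training procedure that needs the trained Doc2Vec model, whereas the whole point of KeyedVectors is to abstract away from the training method. If you need infer_vector, keep the full Doc2Vec model rather than only its KeyedVectors.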
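
If only word vectors are available, one common baseline (not an equivalent of infer_vector, and usually a cruder document representation) is to average the vectors of the words in the document. The sketch below assumes only that the vector store supports `token in vectors` and `vectors[token]`, which a gensim KeyedVectors object does. A plain dict stands in for it here so the example runs without a trained model. The function name `mean_document_vector` and the toy data are hypothetical.

```python
from typing import Iterable, List, Optional


def mean_document_vector(tokens: Iterable[str], vectors) -> Optional[List[float]]:
    """Average the vectors of the tokens that are present in `vectors`.

    `vectors` is any object supporting `token in vectors` and
    `vectors[token]` (assumption: a gensim KeyedVectors or a plain dict).
    Returns None when none of the tokens is known.
    """
    # Skip out-of-vocabulary tokens instead of raising KeyError.
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    # Element-wise mean over all known word vectors.
    return [sum(float(v[i]) for v in known) / len(known) for i in range(dim)]


# Toy stand-in for a KeyedVectors object (hypothetical 2-dimensional vectors).
toy_vectors = {
    "cat": [1.0, 0.0],
    "dog": [0.0, 1.0],
}

# "unicorn" is not in the vocabulary, so it is ignored.
doc_vector = mean_document_vector(["cat", "dog", "unicorn"], toy_vectors)
```

The resulting list can then be passed to Annoy's `add_item(i, vector)`, provided the index was created with the same dimensionality as the word vectors.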