I am trying to check the semantic and syntactic performance of a doc2vec model with `doc2vec_model.accuracy(questions-words)`, but it doesn't seem to work: the docs for models.deprecated.doc2vec – Deep learning with paragraph2vec say it has been deprecated since version 3.3.0 of the gensim package. It gives this error message:
AttributeError: 'Doc2Vec' object has no attribute 'accuracy'
It works well with a word2vec model, though. Is there any way I can get this done other than `doc2vec_model.accuracy(questions-words)`, or is it impossible?
A few notes:
That 'accuracy()' test is only a test of word-vectors on analogy problems – an easy evaluation to run, used in a number of papers, but not the final authority on whether a set of word-vectors is better than others for a particular purpose. (When I've had a project-specific scoring method, sometimes the word-vectors that score best on project-specific goals don't score best on those analogies – especially if the word-vectors are being used for a classification or information-retrieval task.)
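To make concrete what such an analogy test measures, here is a minimal pure-Python sketch. It assumes the usual "a is to b as c is to ?" scoring, where the predicted word is the one closest to `b - a + c`, excluding the three question words. The toy vectors and the helper names (`solve_analogy`, `analogy_accuracy`) are invented for illustration; gensim works on unit-normalized vectors and real vocabularies, which this sketch skips.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(vectors, a, b, c):
    """Return the word closest to (b - a + c), excluding a, b and c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


def analogy_accuracy(vectors, questions):
    """Fraction of (a, b, c, expected) questions answered correctly."""
    correct = sum(solve_analogy(vectors, a, b, c) == expected
                  for a, b, c, expected in questions)
    return correct / len(questions)


# Hand-made toy vectors (hypothetical, not from any trained model).
toy_vectors = {
    "man": [1.0, 0.0, 0.0],
    "woman": [1.0, 1.0, 0.0],
    "king": [1.0, 0.0, 1.0],
    "queen": [1.0, 1.0, 1.0],
    "apple": [-1.0, 0.2, -0.5],
}
questions = [
    ("man", "woman", "king", "queen"),
    ("king", "queen", "man", "woman"),
]
print(analogy_accuracy(toy_vectors, questions))
```

A high score here only says the vectors arrange these particular relations linearly, which is why it may not track performance on a classification or retrieval task.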
Further, the popular and fast PV-DBOW `Doc2Vec` mode (`dm=0` in gensim) doesn't train word-vectors at all, unless you add another setting (`dbow_words=1`). Such untrained word-vectors will be in random locations, scoring awfully on the analogies-accuracy.
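As a rough sketch, these settings could be written down as keyword arguments. The constant and helper names below are hypothetical. The `dm` and `dbow_words` parameters are the ones named above, and the defaults assumed in the code (`dm=1`, `dbow_words=0`) are gensim's.

```python
# Doc2Vec keyword arguments for the configurations discussed above.
PV_DBOW = {"dm": 0}                              # word-vectors stay random
PV_DBOW_WITH_WORDS = {"dm": 0, "dbow_words": 1}  # also trains word-vectors
PV_DM = {"dm": 1}                                # word-vectors trained too


def trains_word_vectors(params):
    """True if these settings leave Doc2Vec with trained word-vectors."""
    # Assumes gensim's defaults: dm=1 and dbow_words=0.
    return params.get("dm", 1) == 1 or params.get("dbow_words", 0) == 1


def build_doc2vec(tagged_docs, params, **other):
    """Train a Doc2Vec model with one of the configurations above."""
    # Imported here so trains_word_vectors() works without gensim installed.
    from gensim.models.doc2vec import Doc2Vec
    return Doc2Vec(tagged_docs, **params, **other)
```

Checking `trains_word_vectors(...)` before running a word-level evaluation is a cheap way to avoid scoring random vectors by accident.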
But using either PV-DM (`dm=1`) mode, or adding `dbow_words=1` to PV-DBOW, will get you word-vectors from `Doc2Vec`, and you might still want to run the analogies test. Fortunately, the analogy-evaluation options have been retained, and even expanded, on the `KeyedVectors` object that's held in the `Doc2Vec` model's `wv` property. You can call the old `accuracy()` method there, that is, as `doc2vec_model.wv.accuracy()` rather than `doc2vec_model.accuracy()`.
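A minimal sketch of that call, with a small helper to turn the result into per-section scores. It assumes a gensim 3.x model, where `accuracy()` returns a list of dicts with `'section'`, `'correct'` and `'incorrect'` keys. The helper names and the `questions_path` argument (a path to a file such as questions-words.txt) are hypothetical.

```python
def summarize_accuracy(sections):
    """Map each section name to its fraction of correctly solved analogies."""
    summary = {}
    for section in sections:
        n_correct = len(section["correct"])
        n_total = n_correct + len(section["incorrect"])
        if n_total:  # skip sections where no question could be attempted
            summary[section["section"]] = n_correct / n_total
    return summary


def doc2vec_analogy_report(doc2vec_model, questions_path):
    """Run the analogy test on a Doc2Vec model's word-vectors."""
    # accuracy() lives on the KeyedVectors in `.wv`, not on Doc2Vec itself.
    sections = doc2vec_model.wv.accuracy(questions_path)
    return summarize_accuracy(sections)
```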
There's also a different evaluation there, `evaluate_word_pairs()`, which scores word-pair similarities against human ratings rather than solving analogies.
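A hedged sketch of calling it through `wv`. This assumes `evaluate_word_pairs()` takes the path to a tab-separated file of `word1`, `word2`, `score` lines and returns a Pearson result, a Spearman result and an out-of-vocabulary percentage. The helper name and `pairs_path` are hypothetical.

```python
def word_similarity_report(doc2vec_model, pairs_path):
    """Return Pearson r, Spearman rho and the out-of-vocabulary percentage."""
    pearson, spearman, oov_percent = doc2vec_model.wv.evaluate_word_pairs(pairs_path)
    # Each correlation result is indexable as (coefficient, p-value);
    # keep only the coefficient.
    return {
        "pearson": pearson[0],
        "spearman": spearman[0],
        "oov_percent": oov_percent,
    }
```

A high out-of-vocabulary percentage means the correlations rest on few pairs, so it is worth reading alongside the two coefficients.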
(And in the 4.0.0 release there'll be an [evaluate_word_analogies()][1] method, which replaces `accuracy()`.)
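If the same script has to run against different gensim versions, one option is to check which method the `KeyedVectors` object offers. The sketch below assumes `evaluate_word_analogies()` returns an `(overall score, sections)` pair, and that the last entry of the list returned by `accuracy()` is the total over all sections. The function name and `questions_path` are hypothetical.

```python
def overall_analogy_score(doc2vec_model, questions_path):
    """Fraction of analogies solved, using whichever method is available."""
    word_vectors = doc2vec_model.wv
    if hasattr(word_vectors, "evaluate_word_analogies"):
        # Assumed return value: (overall score, per-section details).
        score, _sections = word_vectors.evaluate_word_analogies(questions_path)
        return score
    # Fallback: the last entry of accuracy()'s result is the overall total.
    total = word_vectors.accuracy(questions_path)[-1]
    attempted = len(total["correct"]) + len(total["incorrect"])
    return len(total["correct"]) / attempted if attempted else 0.0
```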