Is there a way to find similar docs, like we do in word2vec?
Like:
model2.most_similar(positive=['good', 'nice', 'best'],
                    negative=['bad', 'poor'],
                    topn=10)
I know we can use infer_vector() and feed the resulting vectors in to find similar documents, but I want to supply many positive and negative examples, as we do in word2vec.
Is there any way to do that? Thanks!
The doc-vectors part of a Doc2Vec model works just like word-vectors, with respect to a most_similar() call. You can supply multiple doc-tags or full vectors inside both the positive and negative parameters.
So you could call...
sims = d2v_model.docvecs.most_similar(positive=['doc001', 'doc009'], negative=['doc102'])
...and it should work. The elements of the positive or negative lists could be doc-tags that were present during training, or raw vectors (like those returned by infer_vector(), or your own averages of multiple such vectors).
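To make the positive/negative mechanics concrete, here is a minimal plain-Python sketch of how such a query can be answered. It is not gensim's code: it assumes the examples are combined as a mean of unit-normalized vectors (weight +1 for positive, -1 for negative) and that candidates are ranked by cosine similarity. The toy_docs dictionary and its values are hypothetical stand-ins for trained doc-vectors.

```python
import math

def unit(vec):
    """Scale a vector to unit length (an all-zero vector is returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def most_similar(doc_vectors, positive=(), negative=(), topn=10):
    """Rank stored docs by cosine similarity to a combined query vector.

    Items in positive/negative may be doc-tags (keys of doc_vectors)
    or raw vectors (lists of floats).
    """
    def resolve(item):
        # A string is treated as a doc-tag; anything else as a raw vector.
        return doc_vectors[item] if isinstance(item, str) else item

    weighted = [(unit(resolve(p)), 1.0) for p in positive]
    weighted += [(unit(resolve(n)), -1.0) for n in negative]
    if not weighted:
        raise ValueError("need at least one positive or negative example")

    # Assumed combination rule: weighted mean of unit vectors, re-normalized.
    dim = len(weighted[0][0])
    query = unit([sum(w * v[i] for v, w in weighted) / len(weighted)
                  for i in range(dim)])

    # Doc-tags used in the query are left out of the results.
    used_tags = {x for x in list(positive) + list(negative) if isinstance(x, str)}
    scores = [
        (tag, sum(q * d for q, d in zip(query, unit(vec))))
        for tag, vec in doc_vectors.items()
        if tag not in used_tags
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]

# Hypothetical 2-dimensional doc-vectors, for illustration only.
toy_docs = {
    "doc001": [1.0, 0.1],
    "doc009": [0.9, 0.2],
    "doc050": [0.8, 0.3],
    "doc102": [-1.0, 0.0],
    "doc200": [-0.9, 0.1],
}

# Several doc-tags as positive and negative examples.
sims = most_similar(toy_docs, positive=["doc001", "doc009"],
                    negative=["doc102"], topn=2)

# A raw vector (standing in for an inferred one) as the only positive example.
sims_raw = most_similar(toy_docs, positive=[[1.0, 0.0]], topn=1)
```

Because negative examples enter with weight -1, the query is pushed away from them, so documents close to doc102 should rank last here.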