
Why does model.wv.similarity() in Word2Vec return different results from model.wv.most_similar()?


I have trained a Word2Vec model and am trying to use it. When I query the most similar words to '动力', I get output like this:

动力系统 0.6429724097251892
驱动力 0.5936785936355591
动能 0.5788494348526001
动力车 0.5579575300216675
引擎 0.5339343547821045
推动力 0.5152761936187744
扭力 0.501279354095459
新动力 0.5010953545570374
支撑力 0.48610919713974
精神力量 0.47970670461654663

The problem is that model.wv.similarity('动力','动力系统') returns 0.0, which does not match the score listed above:

0.6429724097251892

What confused me even more is that the similarity between '动力' and '驱动力' came out as

3.689349e+19

Why is this? Have I misunderstood what similarity means here? The code is:

res = model.wv.most_similar('动力')
for r in res:
    print(r[0],r[1])
print(model.wv.similarity('动力','动力系统'))
print(model.wv.similarity('动力','驱动力'))
print(model.wv.similarity('动力','动能'))

Output:

动力系统 0.6429724097251892
驱动力 0.5936785936355591
动能 0.5788494348526001
动力车 0.5579575300216675
引擎 0.5339343547821045
推动力 0.5152761936187744
扭力 0.501279354095459
新动力 0.5010953545570374
支撑力 0.48610919713974
精神力量 0.47970670461654663
0.0
3.689349e+19
2.0

Solution

  • I wrote a function that computes the cosine similarity directly, to use in place of the model.wv.similarity method.

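    def Similarity(w1,w2,model):
        # Look up vectors via model.wv; plain model[w1] was removed in gensim 4.
        A = model.wv[w1]; B = model.wv[w2]
        # Cosine similarity: dot product over the product of the vector norms.
        return sum(A*B)/(pow(sum(pow(A,2)),0.5)*pow(sum(pow(B,2)),0.5))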
    

    Here w1 and w2 are the input words, and model is the Word2Vec model you have trained.
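
    As a sanity check on the idea behind this workaround, here is a minimal, self-contained sketch of the same cosine formula written with NumPy element-wise operations. It does not use gensim: toy_vectors is a hypothetical stand-in for model.wv, and cosine_similarity and rank_by_similarity are illustrative names, not gensim functions. Cosine similarity always lies in [-1, 1], so the score used for ranking and the score for a single pair should agree, and values such as 3.689349e+19 or 2.0 cannot be valid results.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity of two 1-D vectors, computed element-wise in float64."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.sum(a * b) / (np.sqrt(np.sum(a * a)) * np.sqrt(np.sum(b * b))))

def rank_by_similarity(word, vectors, topn=10):
    """Rank every other word by cosine similarity to `word`, highest first."""
    scores = [(other, cosine_similarity(vectors[word], vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]

# Hypothetical toy vectors standing in for model.wv; these are not real embeddings.
toy_vectors = {
    "power": np.array([1.0, 0.0, 1.0], dtype=np.float32),
    "engine": np.array([1.0, 0.0, 0.5], dtype=np.float32),
    "banana": np.array([0.0, 1.0, 0.0], dtype=np.float32),
}

ranking = rank_by_similarity("power", toy_vectors)
top_word, top_score = ranking[0]

# The ranked score and the pairwise score come from the same formula, so they agree.
pairwise = cosine_similarity(toy_vectors["power"], toy_vectors["engine"])
print(top_word, top_score, pairwise)
```

    With a real model you would pass model.wv[w1] and model.wv[w2] to cosine_similarity. If the result matches the most_similar score but model.wv.similarity does not, the vectors themselves are fine and the discrepancy lies in how similarity is being computed in that environment.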