I have trained a Word2Vec model and I am trying to use it. When I input the most similar words of ‘动力', I got the output like this:
动力系统 0.6429724097251892
驱动力 0.5936785936355591
动能 0.5788494348526001
动力车 0.5579575300216675
引擎 0.5339343547821045
推动力 0.5152761936187744
扭力 0.501279354095459
新动力 0.5010953545570374
支撑力 0.48610919713974
精神力量 0.47970670461654663
But the problem is that if I input model.wv.similarity('动力','动力系统')
I got the result 0.0, which is not equal with
0.6429724097251892
what confused me more was that when I got the next similarity of word '动力' and word '驱动力', it showed
3.689349e+19
So why ? Did I make misunderstanding with the similarity? I need someone to tell me!! And the code is:
res = model.wv.most_similar('动力')
for r in res:
print(r[0],r[1])
print(model.wv.similarity('动力','动力系统'))
print(model.wv.similarity('动力','驱动力'))
print(model.wv.similarity('动力','动能'))
output:
动力系统 0.6429724097251892
驱动力 0.5936785936355591
动能 0.5788494348526001
动力车 0.5579575300216675
引擎 0.5339343547821045
推动力 0.5152761936187744
扭力 0.501279354095459
新动力 0.5010953545570374
支撑力 0.48610919713974
精神力量 0.47970670461654663
0.0
3.689349e+19
2.0
I have written a function to replace the model.wv.similarity
method.
def Similarity(w1,w2,model):
A = model[w1]; B = model[w2]
return sum(A*B)/(pow(sum(pow(A,2)),0.5)*pow(sum(pow(B,2)),0.5)
Where w1
and w2
are the words you input, model
is the Word2Vec model you have trained.