I am trying to understand how python-glove computes most-similar
terms.
Is it using cosine similarity?
Example from the python-glove GitHub repository:
https://github.com/maciejkula/glove-python/tree/master/glove
I know that in gensim's word2vec, the most_similar
method ranks words by cosine similarity.
The project website is a bit unclear on this point:
The Euclidean distance (or cosine similarity) between two word vectors provides an effective method for measuring the linguistic or semantic similarity of the corresponding words.
Euclidean distance is not the same as cosine similarity. It sounds like either works well enough, but the website does not say which one is used.
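To see why the distinction matters, here is a minimal sketch with made-up 2-D vectors (the values and the helper names `cosine_similarity` and `euclidean_distance` are my own, not from the library). It shows that the two measures can disagree about which vector is the nearest neighbour:

```python
import numpy as np

# Toy vectors invented for illustration only.
query = np.array([1.0, 1.0])
a = np.array([5.0, 5.0])  # same direction as query, but much longer
b = np.array([1.0, 0.0])  # different direction, but close to query in space


def cosine_similarity(u, v):
    # Cosine of the angle between u and v; ignores vector length.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))


def euclidean_distance(u, v):
    # Straight-line distance between u and v; sensitive to vector length.
    return np.linalg.norm(u - v)


cos_a, cos_b = cosine_similarity(query, a), cosine_similarity(query, b)
euc_a, euc_b = euclidean_distance(query, a), euclidean_distance(query, b)

print("cosine similarity:  a =", cos_a, " b =", cos_b)
print("euclidean distance: a =", euc_a, " b =", euc_b)
```

By cosine similarity `a` is the closer match (higher is better), while by Euclidean distance `b` is closer (lower is better), so which measure a library uses can change the ranking.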
However, we can look at the source of the repo you linked to see:
dst = (np.dot(self.word_vectors, word_vec)
       / np.linalg.norm(self.word_vectors, axis=1)
       / np.linalg.norm(word_vec))
It uses cosine similarity.
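For reference, here is a rough standalone sketch of how that expression can be turned into a most-similar lookup. Only the `dst` computation mirrors the snippet above; the function `most_similar`, its parameters, the toy `dictionary`, and the toy vectors are assumptions of mine for illustration, not the library's actual code:

```python
import numpy as np


def most_similar(word, word_vectors, dictionary, number=5):
    """Hypothetical helper: rank words by cosine similarity to `word`."""
    word_vec = word_vectors[dictionary[word]]

    # Same expression as in the repo snippet: dot product divided by
    # the norm of every row and by the norm of the query vector.
    dst = (np.dot(word_vectors, word_vec)
           / np.linalg.norm(word_vectors, axis=1)
           / np.linalg.norm(word_vec))

    # Map row indices back to words and sort by descending similarity.
    inverse = {idx: w for w, idx in dictionary.items()}
    order = np.argsort(-dst)
    ranked = [(inverse[int(i)], float(dst[i])) for i in order]

    # Drop the query word itself, which always has similarity 1.
    return [pair for pair in ranked if pair[0] != word][:number]


# Toy data, invented for this example.
dictionary = {"king": 0, "queen": 1, "apple": 2}
word_vectors = np.array([[1.0, 0.9],
                         [0.9, 1.0],
                         [-1.0, 0.2]])

result = most_similar("king", word_vectors, dictionary, number=2)
print(result)
```

With these toy vectors, "queen" ranks above "apple" because its vector points in nearly the same direction as "king", regardless of vector length.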