Tags: python, machine-learning, deep-learning, sentence-similarity

Similarity between two sentences using word2vec


I have two sentences:

    sentence1 = "this is a sentence"
    sentence2 = "this is sentence 2"

I want to find the similarity between these two sentences. Could somebody help me out with complete code for this using Word2Vec?


Solution

  • Assuming you have some word2vec utility available as a function word2vec(word) that returns the vector for a word as a NumPy array:

    import numpy as np

    def sentence_vector(sentence):
        # The meaning of a sentence is approximated by the average of its word vectors
        words = sentence.split()  # split on any whitespace, ignoring repeated spaces
        # np.mean builds a new array, so the vectors returned by word2vec are never modified in place
        return np.mean([word2vec(w) for w in words], axis=0)

    sentence1_meaning = sentence_vector(sentence1)
    sentence2_meaning = sentence_vector(sentence2)

    # Similarity is the cosine of the angle between the two sentence vectors
    similarity = np.dot(sentence1_meaning, sentence2_meaning) / (
        np.linalg.norm(sentence1_meaning) * np.linalg.norm(sentence2_meaning)
    )
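
Since the question asks for complete code, here is a minimal end-to-end sketch of the same averaging approach. It is only an illustration under stated assumptions: the EMBEDDINGS table and its values are made up and stand in for a trained model, and the helper names (word2vec, sentence_vector, cosine_similarity) are hypothetical. With a real model you would replace the lookup with your library's call (with gensim, for example, typically model.wv[word]). The sketch also skips words the model does not know, which matters here because a token like "2" may not be in the vocabulary.

```python
import numpy as np

# Hypothetical toy embedding table standing in for a trained word2vec model.
# The vectors are made up for illustration only.
EMBEDDINGS = {
    "this": np.array([0.9, 0.1, 0.0]),
    "is": np.array([0.8, 0.2, 0.1]),
    "a": np.array([0.7, 0.3, 0.2]),
    "sentence": np.array([0.1, 0.9, 0.3]),
}


def word2vec(word):
    """Return the vector for a word, or None if it is out of vocabulary."""
    return EMBEDDINGS.get(word.lower())


def sentence_vector(sentence):
    """Average the vectors of the in-vocabulary words of a sentence."""
    vectors = [v for v in (word2vec(w) for w in sentence.split()) if v is not None]
    if not vectors:
        return None  # no known words, so there is no meaningful vector
    return np.mean(vectors, axis=0)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is missing or zero."""
    if a is None or b is None:
        return 0.0
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom else 0.0


sentence1 = "this is a sentence"
sentence2 = "this is sentence 2"  # "2" is not in the toy vocabulary and is skipped

similarity = cosine_similarity(sentence_vector(sentence1), sentence_vector(sentence2))
print(similarity)
```

Returning 0.0 when a sentence has no known words is one possible design choice; raising an error may be more appropriate if silent fallbacks would hide data problems.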