deep-learning, nlp, text-processing, word-embedding

Better way to combine word embeddings to get the embedding of a sentence


I have seen in many Kaggle kernels and tutorials that averaging word embeddings is used to get the embedding of a sentence. But I am wondering whether this is a correct approach, since it discards the positional information of the words in the sentence. Is there a better way to combine the embeddings, maybe combining them hierarchically in some particular way?
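For context, the averaging I mean looks roughly like this (a minimal sketch; word_vectors is assumed to be a dict mapping tokens to pretrained vectors such as GloVe):

    import numpy as np

    def average_embedding(sentence, word_vectors, dim=300):
        # keep only the tokens we actually have vectors for
        vectors = [word_vectors[w] for w in sentence.split() if w in word_vectors]
        if not vectors:
            return np.zeros(dim)  # fall back for empty / all-unknown sentences
        # plain mean over word vectors: word order is completely lost here
        return np.mean(vectors, axis=0)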


Solution

  • If you need a simple yet effective approach, SIF embedding is perfectly fine. It averages the word vectors in a sentence and then removes the first principal component of the resulting sentence embeddings. It is much superior to plain averaging of word vectors. The code is available online here. Here is the main part:

    from sklearn.decomposition import TruncatedSVD

    # Fit the first principal component of the sentence embedding matrix
    svd = TruncatedSVD(n_components=1, random_state=rand_seed, n_iter=20)
    svd.fit(all_vector_representation)
    svd = svd.components_

    # Subtract each embedding's projection onto that first component
    XX2 = all_vector_representation - all_vector_representation.dot(svd.transpose()) * svd
    

    where all_vector_representation is the matrix of averaged word embeddings, one row per sentence in your dataset.
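    For completeness, the rest of SIF is the weighted average itself: before the principal-component removal above, each word vector is scaled by a / (a + p(w)), where p(w) is the word's estimated corpus frequency and a is a small constant (the SIF paper suggests values around 1e-3). A rough sketch of how all_vector_representation could be built this way, assuming word_vectors is a dict of pretrained vectors, word_freq maps words to relative frequencies, and sentences is your list of raw sentences:

    import numpy as np

    def sif_average(sentence, word_vectors, word_freq, a=1e-3, dim=300):
        # down-weight frequent words by a / (a + p(w)) before averaging
        vecs = [word_vectors[w] * (a / (a + word_freq.get(w, 0.0)))
                for w in sentence.split() if w in word_vectors]
        return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

    # one row per sentence; feed this matrix into the SVD snippet above
    all_vector_representation = np.vstack(
        [sif_average(s, word_vectors, word_freq) for s in sentences])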

    Other, more sophisticated approaches also exist, such as ELMo and Transformer-based models.
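    If you want to go the contextual route, one convenient option (just one of several such libraries) is sentence-transformers, which wraps pretrained Transformer models to produce sentence embeddings directly:

    from sentence_transformers import SentenceTransformer

    # 'all-MiniLM-L6-v2' is one commonly used pretrained model; others work the same way
    model = SentenceTransformer('all-MiniLM-L6-v2')
    embeddings = model.encode(["This is a sentence.", "And another one."])
    print(embeddings.shape)  # (2, 384) for this particular model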