nlp pytorch embedding

How do I get words from an embedded vector?


When my generator produces word vectors, how can I convert them back into the original words? I used the nn.Embedding module built into PyTorch to embed the words.


Solution

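  • Since you didn't provide any code, I am using the code below, with comments, to answer your question. nn.Embedding is a lookup table with no built-in inverse, so the idea is to compare your vector against every row of the embedding matrix and return the index of the closest one. Feel free to add more information about your particular use case.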

    import torch

    # declare embeddings: a vocabulary of 5 words, 10 dimensions each
    embed = torch.nn.Embedding(5, 10)

    # generate the embedding vector for word index 4 in the vocab
    word = torch.tensor([4])
    vector = embed(word)

    # search the embedding matrix for the row closest to `vector`
    def search(vector, distance_fun):
        weights = embed.weight
        min_dist = torch.tensor(float('inf'))
        idx = -1
        v, e = weights.shape

        # each row of the embedding matrix corresponds to one word;
        # use a distance function to compare each row with `vector`
        for i in range(v):
            dist = distance_fun(vector, weights[i])
            if dist < min_dist:
                min_dist = dist
                idx = i
        return idx

    # searching with squared distance; returns 4, the index of the original word
    search(vector, lambda x, y: ((x - y) ** 2).sum())
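
The same nearest-row lookup can also be done for a whole batch of vectors at once instead of looping in Python. The following is only a sketch of that idea, not the only way to do it: `nearest_indices` and the `idx_to_word` list are hypothetical names made up for illustration, and your own vocabulary mapping will look different. It uses `torch.cdist` for the pairwise Euclidean distances and `argmin` to pick the closest row.

```python
import torch

torch.manual_seed(0)  # fixed seed so the illustration is reproducible
embed = torch.nn.Embedding(5, 10)

# Hypothetical index-to-word mapping; substitute your real vocabulary.
idx_to_word = ["the", "cat", "sat", "on", "mat"]

def nearest_indices(vectors, embedding):
    """Return, for each row of `vectors`, the index of the closest embedding row."""
    with torch.no_grad():
        # Pairwise Euclidean distances, shape (num_vectors, vocab_size)
        dists = torch.cdist(vectors, embedding.weight)
    return dists.argmin(dim=1)

# Embed a few word indices, then map the vectors back to indices and words.
ids = torch.tensor([4, 0, 2])
vectors = embed(ids).detach()
recovered = nearest_indices(vectors, embed)
words = [idx_to_word[i] for i in recovered.tolist()]
print(recovered.tolist(), words)
```

If the generator's output is not an exact row of the embedding matrix, this still returns the closest word, which may or may not be the one you intended. Whether Euclidean distance or something like cosine similarity works better depends on how the vectors were trained.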