Tags: deep-learning, pytorch, layer, embedding

How to implement a low-dimensional embedding layer in PyTorch


I recently read a paper about embeddings.

In Eq. (3), f is a 4096×1 vector. The author compresses it into theta (a 20×1 vector) using an embedding matrix E.

The equation is simply theta = E*f.

I was wondering whether this can be done in PyTorch, so that E is learned automatically during training.

How do I finish the rest? Thanks so much.

The demo code so far:

import torch
from torch import nn

f = torch.randn(4096, 1)  # the 4096x1 feature vector f

Solution

  • Assuming your input vectors are one-hot (which is where "embedding layers" are typically used), you can directly use nn.Embedding from torch, which does the above and a bit more. nn.Embedding takes the index of the non-zero element of each one-hot vector as input, as a long tensor. For example, if the feature vector is

    f = [[0,0,1], [1,0,0]]
    

    then the input to nn.Embedding will be

    input = [2, 0]
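
    A minimal sketch of that lookup (the variable names and the vocabulary size of 3 are illustrative; the embedding dimension of 20 matches the question):

    import torch
    from torch import nn

    # 3 possible one-hot positions, each mapped to a learnable 20-dim vector
    embedding = nn.Embedding(num_embeddings=3, embedding_dim=20)

    # indices of the non-zero entries of f = [[0,0,1], [1,0,0]]
    indices = torch.tensor([2, 0], dtype=torch.long)
    theta = embedding(indices)  # shape (2, 20), one embedding per index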

    However, what the OP asked about is getting embeddings by matrix multiplication, and I will address that below. You can define a module to do it as follows. Since param is an instance of nn.Parameter, it will be registered as a parameter of the module and will be optimized when you call Adam or any other optimizer.

    class Embedding(nn.Module):
        def __init__(self, input_dim, embedding_dim):
            super().__init__()
            # E: learnable embedding matrix of shape (input_dim, embedding_dim)
            self.param = torch.nn.Parameter(torch.randn(input_dim, embedding_dim))

        def forward(self, x):
            # x: (batch, input_dim) -> (batch, embedding_dim)
            return torch.mm(x, self.param)
    

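    A short usage sketch of this module with the sizes from the question (variable names and the learning rate are illustrative); f is transposed because torch.mm expects a (batch, input_dim) matrix:

    f = torch.randn(4096, 1)                   # 4096x1 feature vector from the question
    emb = Embedding(input_dim=4096, embedding_dim=20)
    theta = emb(f.t())                         # (1, 4096) x (4096, 20) -> (1, 20)

    # param is registered, so the optimizer updates E during training
    optimizer = torch.optim.Adam(emb.parameters(), lr=1e-3)
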
    If you look carefully, this is the same as a linear layer with no bias and a slightly different initialization. Therefore, you can achieve the same thing by using a linear layer, as below.

    embedding = nn.Linear(4096, 20, bias=False)
    # change the initial weights to normal(0, 1) or whatever is required
    embedding.weight.data = torch.randn_like(embedding.weight)
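
    A short usage sketch with the question's sizes (illustrative only, not part of the original answer): nn.Linear stores E as a (20, 4096) weight matrix, so applying the layer to the transposed feature vector yields the 20-dimensional theta, and E is updated by whatever optimizer trains the model.

    f = torch.randn(4096, 1)                        # 4096x1 feature vector from the question
    theta = embedding(f.t())                        # (1, 4096) -> (1, 20), i.e. theta = E*f as a row
    optimizer = torch.optim.Adam(embedding.parameters())  # E is learned during training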