machine-learning, deep-learning, pytorch, torch

Get 2D output from the embedding layer in PyTorch


I have an X_train of size (2, 100). I want to use the first 50 columns of the data directly, and feed the second 50 columns of this matrix into an embedding layer to convert them to a matrix of size 2*3.

I have read a lot about the embedding layer in PyTorch, but I did not understand it well. I don't know how to get a 2*3 matrix as the output of the embedding layer. Could you please help me with that? Here is a simple example.

import numpy as np
import torch
import torch.nn as nn

X_train = np.random.randint(10, size=(2, 100))

X_train_not_embedding = X_train[:, 0:50]   # used directly, shape (2, 50)
X_train_embedding     = X_train[:, 50:100] # to be embedded, shape (2, 50)
X_train_embedding = torch.LongTensor(X_train_embedding) # keep shape (2, 50); wrapping in a list would add an extra dimension
embedding = nn.Embedding(50, 3)
embedding_output = embedding(X_train_embedding) # I want to get an embedding output of shape (2, 3)

#X_train_new = torch.cat([X_train_not_embedding, embedding_output], 1) # here I want to build a matrix of size (2, 53)

Solution

  • From the discussion, it looks like your understanding of Embeddings is not accurate.

    1. Use one Embedding per feature. In your example you are combining dates, IDs, etc. in a single Embedding; even in the Medium article, they use separate embeddings (see the sketch after this list).
    2. Think of an Embedding as one-hot encoding on steroids (less memory, it can capture correlations in the data, etc.). If you do not understand one-hot encoding, I would start there first.
    3. kWh is already a real value, not a categorical one. Use it as a linear input to the network (after normalization).
    4. ID: I do not know what ID denotes in your data; if it is a unique ID for each data point, it is not useful and should be excluded.
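
    To make points 1-3 concrete, here is a minimal sketch. The feature names (date_idx, region_id, kwh) and the vocabulary sizes are hypothetical placeholders, not taken from your data:

    import torch
    import torch.nn as nn

    # Hypothetical features for 2 samples: a date index and a region ID
    # (categorical), plus a real-valued kWh reading. Names and sizes are
    # assumptions for illustration only.
    date_idx  = torch.LongTensor([3, 7])         # values in [0, 12)
    region_id = torch.LongTensor([0, 4])         # values in [0, 5)
    kwh       = torch.tensor([[120.0], [87.5]])  # shape (2, 1)

    # One Embedding per categorical feature (point 1).
    date_emb   = nn.Embedding(num_embeddings=12, embedding_dim=3)
    region_emb = nn.Embedding(num_embeddings=5,  embedding_dim=3)

    # Normalize the real-valued feature instead of embedding it (point 3).
    kwh_norm = (kwh - kwh.mean()) / (kwh.std() + 1e-8)

    # An index tensor of shape (2,) through an Embedding gives shape (2, 3),
    # which is the 2*3 output asked about; an input of shape (2, 50) would
    # give (2, 50, 3) instead.
    x = torch.cat([date_emb(date_idx), region_emb(region_id), kwh_norm], dim=1)
    print(x.shape)  # torch.Size([2, 7])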

    If the above does not make sense, I would start with a simple network using an LSTM and get it working first before moving to a more advanced architecture.
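
    A minimal sketch of such a baseline, assuming a single regression target; the batch size, sequence length, hidden size, and input size are arbitrary placeholders:

    import torch
    import torch.nn as nn

    class SimpleLSTM(nn.Module):
        def __init__(self, input_size=7, hidden_size=32):
            super().__init__()
            self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
            self.head = nn.Linear(hidden_size, 1)  # single regression output

        def forward(self, x):                # x: (batch, seq_len, input_size)
            out, _ = self.lstm(x)            # out: (batch, seq_len, hidden_size)
            return self.head(out[:, -1, :])  # predict from the last time step

    model = SimpleLSTM()
    dummy = torch.randn(4, 10, 7)  # batch of 4, sequences of length 10
    print(model(dummy).shape)      # torch.Size([4, 1])

    Once this works end to end, you can swap the plain inputs for the embedded features sketched above.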