Tags: machine-learning, nlp, pytorch, lstm, text-processing

Understanding the output of LSTM predictions


  • It's a 15-class classification model (OUTPUT_DIM = 15). The input is a sentence encoded as a vector of token indices, e.g. 'hi my name is' => [1, 43, 2, 56].

  • When I call predictions = model(x_train[0]) I get a tensor of shape torch.Size([100, 15]) instead of a 1D tensor with 15 class scores, i.e. torch.Size([15]). What's happening? Why is this the output, and how can I fix it? More info below.

The model (adapted from the official docs) is the following:

import torch.nn as nn
import torch.nn.functional as F  # needed for F.log_softmax below

class RNN(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim, output_dim):
        
        super().__init__()
        
        self.word_embeddings = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim)
        self.hidden2tag = nn.Linear(hidden_dim, output_dim)
        
    def forward(self, text):
                
        embeds = self.word_embeddings(text)                         # (seq_len, embedding_dim)
        lstm_out, _ = self.lstm(embeds.view(len(text), 1, -1))      # (seq_len, 1, hidden_dim)
        tag_space = self.hidden2tag(lstm_out.view(len(text), -1))   # (seq_len, output_dim)
        tag_scores = F.log_softmax(tag_space, dim=1)

        return tag_scores

Parameters:

INPUT_DIM = 62288
EMBEDDING_DIM = 64
HIDDEN_DIM = 128
OUTPUT_DIM = 15
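
For reference, a minimal reproduction of the shape you are seeing (the random 100-token tensor below is a hypothetical stand-in for x_train[0]):

import torch

model = RNN(INPUT_DIM, EMBEDDING_DIM, HIDDEN_DIM, OUTPUT_DIM)

# Stand-in input: a sequence of 100 token indices, like x_train[0].
x = torch.randint(0, INPUT_DIM, (100,))

predictions = model(x)
print(predictions.shape)  # torch.Size([100, 15]) -- one row of class scores per token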

Solution

  • The LSTM module in PyTorch returns the output of every timestep, not just the last one (this is useful in some cases, such as sequence tagging). Your model then applies the linear layer to every timestep, so in your example you get exactly 100 rows: the number of timesteps is simply your sequence length.

    But since you are doing classification, you only care about the output of the last timestep. Assuming an LSTM built with batch_first=True, you can normally get it like this (a version of your forward adapted to your sequence-first layout is sketched below):

    outputs, _ = self.lstm(embeddings)
    # shape: batch_size x 100 x hidden_dim
    output = outputs[:, -1]
    # shape: batch_size x hidden_dim
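
    Your model keeps the default batch_first=False and reshapes the embeddings to (seq_len, 1, embedding_dim), so the sequence dimension comes first. A minimal sketch of your forward with the last-timestep fix applied (same layers as above) would be:

    def forward(self, text):

        embeds = self.word_embeddings(text)
        lstm_out, _ = self.lstm(embeds.view(len(text), 1, -1))
        # lstm_out: (seq_len, 1, hidden_dim) -- keep only the last timestep
        last_out = lstm_out[-1]                       # (1, hidden_dim)
        tag_space = self.hidden2tag(last_out)         # (1, output_dim)
        tag_scores = F.log_softmax(tag_space, dim=1)  # (1, 15)

        return tag_scores

    tag_scores then has shape torch.Size([1, 15]); calling tag_scores.squeeze(0) gives the 1D torch.Size([15]) you were expecting.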