Tags: deep-learning, pytorch, recurrent-neural-network, autoencoder

LSTM Autoencoder Output Layer


I am trying to get an LSTM autoencoder to recreate its inputs. So far I have:

import torch
import torch.nn as nn


class getSequence(nn.Module):
    # nn.LSTM returns (output, (h_n, c_n)); keep only the per-step output
    def forward(self, x):
        out, _ = x
        return out


class getLast(nn.Module):
    # keep only the final hidden state of the last layer: (batch, hidden)
    def forward(self, x):
        out, (h_n, c_n) = x
        return h_n[-1]

class AEncoder(nn.Module):
    def __init__(self, input_size, first_layer, second_layer, n_layers):
        super(AEncoder, self).__init__()
        self.n_layers = n_layers
        # batch_first=True on every LSTM so all tensors are (batch, seq_len, features)
        self.encode = nn.Sequential(nn.LSTM(input_size, first_layer, batch_first=True),
                                    getSequence(),
                                    nn.ReLU(),
                                    nn.LSTM(first_layer, second_layer, batch_first=True),
                                    getLast())
        self.decode = nn.Sequential(nn.LSTM(second_layer, first_layer, batch_first=True),
                                    getSequence(),
                                    nn.ReLU(),
                                    nn.LSTM(first_layer, input_size, batch_first=True),
                                    getSequence())

    def forward(self, x):
        x = x.float()
        seq_len = x.size(1)  # read the sequence length from the input instead of hard-coding 32
        x = self.encode(x)
        x = x.unsqueeze(1).repeat(1, seq_len, 1) # repeating last hidden state of self.encode
        x = self.decode(x)
        return x

While researching, I have seen some people add a time-distributed dense layer at the end of self.decode. I am unsure whether that final layer is specific to other tasks that autoencoders are used for. If so, can I ignore it when I am only trying to recreate the inputs?


Solution

  • A time-distributed dense layer is, as the name suggests, just an ordinary dense layer applied to every temporal slice of the input. You can think of it as a special form of RNN cell, i.e. one without a recurrent hidden state.
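
To make the "applied to every temporal slice" point concrete, here is a minimal sketch, assuming PyTorch. `nn.Linear` only transforms the last dimension, so calling it on a `(batch, seq_len, hidden)` tensor should give the same result as applying it to each time step separately. The sizes are arbitrary values chosen for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

batch, seq_len, hidden, n_features = 4, 10, 16, 3  # arbitrary sizes for illustration
dense = nn.Linear(hidden, n_features)
x = torch.randn(batch, seq_len, hidden)

# Apply the dense layer to the whole sequence at once.
all_at_once = dense(x)  # (batch, seq_len, n_features)

# Apply the same layer to each time step separately and stack along the time axis.
step_by_step = torch.stack([dense(x[:, t, :]) for t in range(seq_len)], dim=1)

# Both routes should agree up to floating-point tolerance.
same = torch.allclose(all_at_once, step_by_step, atol=1e-5)
print(tuple(all_at_once.shape), same)
```

So in PyTorch no special wrapper is needed: a plain `nn.Linear` placed after the recurrent layer already behaves as a time-distributed dense layer.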

    So you can use any time-distributed layer as the output layer of an autoencoder that deals with time-distributed inputs, e.g. an RNN layer with LSTM, GRU, or simple RNN cells, or a time-distributed dense layer. In other words, the final dense layer is one option rather than a requirement. In the research paper that proposed the LSTM autoencoder, the basic model for reconstructing a sequence of vectors (image patches or features) uses a single LSTM layer in each of the encoder and the decoder.

    [Figure: structure of the LSTM autoencoder model]
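
One practical difference between the two kinds of output layer can be sketched as follows, assuming PyTorch's `nn.LSTM`. An LSTM's output at each step is `o_t * tanh(c_t)`, which lies between -1 and 1, so a decoder that ends in an LSTM can only reproduce inputs scaled into that range. A final linear layer has no such bound. The sizes and the input scale below are arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

hidden, n_features, seq_len = 8, 3, 20  # arbitrary sizes for illustration
x = 100.0 * torch.randn(2, seq_len, hidden)  # deliberately large inputs

lstm_out_layer = nn.LSTM(hidden, n_features, batch_first=True)
linear_out_layer = nn.Linear(hidden, n_features)

with torch.no_grad():
    lstm_y, _ = lstm_out_layer(x)      # (2, seq_len, n_features)
    linear_y = linear_out_layer(x)     # (2, seq_len, n_features)

lstm_max = lstm_y.abs().max().item()      # bounded by 1 because of the tanh and the output gate
linear_max = linear_y.abs().max().item()  # unbounded, grows with the input scale
print(lstm_max, linear_max)
```

If the inputs are not normalised to that range, this is one reason to prefer the dense output layer when the goal is to recreate the inputs.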

    The following is an example of using a time-distributed dense layer in the decoder:

    import torch
    import torch.nn as nn

    class Decoder(nn.Module):
    
      def __init__(self, seq_len, input_dim=64, n_features=1):
        super(Decoder, self).__init__()
    
        self.seq_len, self.input_dim = seq_len, input_dim
        self.hidden_dim, self.n_features = 2 * input_dim, n_features
    
        self.rnn = nn.LSTM(
          input_size=input_dim,
          hidden_size=self.hidden_dim,
          num_layers=1,
          batch_first=True)
    
        # nn.Linear acts on the last dimension only, so on a
        # (batch, seq_len, hidden_dim) tensor it is a time-distributed dense layer
        self.output_layer = nn.Linear(self.hidden_dim, n_features)
    
      def forward(self, x):
        # x: (batch, input_dim), the encoder's final hidden state
        x = x.unsqueeze(1).repeat(1, self.seq_len, 1)  # (batch, seq_len, input_dim)
        x, (hidden_n, cell_n) = self.rnn(x)            # (batch, seq_len, hidden_dim)
        return self.output_layer(x)                    # (batch, seq_len, n_features)
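
For the question's case of simply recreating the inputs, the pieces can be put together roughly as follows. This is a minimal sketch, not a definitive implementation: the class name `SeqAutoencoder`, the layer sizes, the synthetic sine-wave data, and the optimizer settings are all assumptions made for illustration.

```python
import math

import torch
import torch.nn as nn


class SeqAutoencoder(nn.Module):
    """Hypothetical minimal LSTM autoencoder with a time-distributed linear output."""

    def __init__(self, n_features, embedding_dim):
        super().__init__()
        self.encoder = nn.LSTM(n_features, embedding_dim, batch_first=True)
        self.decoder = nn.LSTM(embedding_dim, embedding_dim, batch_first=True)
        self.output_layer = nn.Linear(embedding_dim, n_features)

    def forward(self, x):
        seq_len = x.size(1)
        _, (h_n, _) = self.encoder(x)              # h_n: (1, batch, embedding_dim)
        z = h_n[-1]                                # (batch, embedding_dim)
        z = z.unsqueeze(1).repeat(1, seq_len, 1)   # same code fed at every time step
        out, _ = self.decoder(z)                   # (batch, seq_len, embedding_dim)
        return self.output_layer(out)              # (batch, seq_len, n_features)


torch.manual_seed(0)

# Synthetic stand-in data: an offset sine wave plus noise, (batch, seq_len, n_features).
t = torch.linspace(0, 2 * math.pi, 32)
wave = torch.sin(t).view(1, 32, 1).repeat(8, 1, 3)
x = 1.0 + wave + 0.1 * torch.randn(8, 32, 3)

model = SeqAutoencoder(n_features=3, embedding_dim=16)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

losses = []
for _ in range(20):
    optimizer.zero_grad()
    recon = model(x)
    loss = loss_fn(recon, x)  # the reconstruction target is the input itself
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(tuple(recon.shape), losses[0], losses[-1])
```

The reconstruction has the same shape as the input, so the mean squared error can be computed directly between the two.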