pytorch · time-series · lstm · autoencoder · dropout

How to deal with dropout in between LSTM layers when using PackedSequence in PyTorch?


I'm creating an LSTM Autoencoder for feature extraction for my master's thesis, but I'm having a lot of trouble combining dropout with LSTM layers.

Since it's an Autoencoder, there is a bottleneck, which I achieve with two separate LSTM layers, each with num_layers=1, and a dropout layer in between. My time series have very different lengths, so packed sequences seemed like a good fit.

However, from my experiments it seems I must pack the data before the first LSTM, unpack it before the dropout, and then pack it again before the second LSTM. This seems wildly inefficient. Is there a better way? Example code and an alternative implementation are provided below.

Current, working, but possibly suboptimal solution:

import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence


class Encoder(nn.Module):

    def __init__(self, seq_len, n_features, embedding_dim, hidden_dim, dropout):
        super(Encoder, self).__init__()

        self.seq_len = seq_len
        self.n_features = n_features
        self.embedding_dim = embedding_dim
        self.hidden_dim = hidden_dim

        self.lstm1 = nn.LSTM(
            input_size=n_features,
            hidden_size=self.hidden_dim,
            num_layers=1,
            batch_first=True,
        )

        self.lstm2 = nn.LSTM(
            input_size=self.hidden_dim,
            hidden_size=embedding_dim,
            num_layers=1,
            batch_first=True,
        )

        self.drop1 = nn.Dropout(p=dropout, inplace=False)

    def forward(self, x):
        # x arrives as a PackedSequence and goes straight into the first LSTM.
        x, (_, _) = self.lstm1(x)
        # Unpack to a padded tensor so dropout can be applied...
        x, lens = pad_packed_sequence(x, batch_first=True, total_length=self.seq_len)
        x = self.drop1(x)
        # ...then re-pack before the second LSTM.
        x = pack_padded_sequence(x, lens, batch_first=True, enforce_sorted=False)
        x, (hidden_n, _) = self.lstm2(x)

        return hidden_n.reshape((-1, self.n_features, self.embedding_dim)), lens

Alternative, possibly better, but currently not working solution:

class Encoder2(nn.Module):

    def __init__(self, seq_len, n_features, embedding_dim, hidden_dim, dropout):
        super(Encoder2, self).__init__()

        self.seq_len = seq_len
        self.n_features = n_features
        self.embedding_dim = embedding_dim
        self.hidden_dim = hidden_dim

        self.lstm1 = nn.LSTM(
            input_size=n_features,
            hidden_size=self.hidden_dim,
            num_layers=2,
            batch_first=True,
            dropout=dropout,
            proj_size=self.embedding_dim,
        )

    def forward(self, x):
        _, (h_n, _) = self.lstm1(x)
        # NOTE: `lens` is not defined in this scope, which is part of why this
        # version does not work yet.
        return h_n[-1].unsqueeze(1), lens

Any help and tips about working with time series, packed sequences, LSTM cells, and dropout would be immensely appreciated, as I haven't found much documentation or guidance elsewhere online. Thank you!

Best, Lars Ankile


Solution

  • For posterity: after a lot of trial and error, the following full code for the Autoencoder seems to work very well. Getting the packing and unpacking to work correctly was the main hurdle. The key, I think, is to use the LSTM modules for all they're worth via the proj_size, num_layers, and dropout parameters.

    import torch.nn as nn
    from torch.nn.utils.rnn import pack_padded_sequence


    class EncoderV4(nn.Module):
        def __init__(
            self, seq_len, n_features, embedding_dim, hidden_dim, dropout, num_layers
        ):
            super().__init__()
    
            self.seq_len = seq_len
            self.n_features = n_features
            self.embedding_dim = embedding_dim
            self.hidden_dim = hidden_dim
            self.num_layers = num_layers
    
            self.lstm1 = nn.LSTM(
                input_size=n_features,
                hidden_size=self.hidden_dim,
                num_layers=num_layers,
                batch_first=True,
                dropout=dropout,
                proj_size=self.embedding_dim,
            )
    
        def forward(self, x):
            # x is a PackedSequence; with proj_size set, h_n has shape
            # (num_layers, batch, embedding_dim).
            _, (h_n, _) = self.lstm1(x)
            # Keep only the last layer's hidden state: (batch, 1, embedding_dim).
            return h_n[-1].unsqueeze(1)
    
    
    class DecoderV4(nn.Module):
        def __init__(self, seq_len, input_dim, hidden_dim, n_features, num_layers):
            super().__init__()
    
            self.seq_len = seq_len
            self.input_dim = input_dim
            self.hidden_dim = hidden_dim
            self.n_features = n_features
            self.num_layers = num_layers
    
            self.lstm1 = nn.LSTM(
                input_size=input_dim,
                hidden_size=hidden_dim,
                num_layers=num_layers,
                proj_size=n_features,
                batch_first=True,
            )
    
        def forward(self, x, lens):
            # Repeat the embedding across the time dimension, then pack it with
            # the original lengths so padded steps are never decoded.
            x = x.repeat(1, self.seq_len, 1)
            x = pack_padded_sequence(x, lens, batch_first=True, enforce_sorted=False)
            # With proj_size=n_features, the output is a PackedSequence whose
            # feature dimension matches the input.
            x, _ = self.lstm1(x)

            return x
    
    
    class RecurrentAutoencoderV4(nn.Module):
        def __init__(
            self, seq_len, n_features, embedding_dim, hidden_dim, dropout, num_layers
        ):
            super().__init__()
    
            self.encoder = EncoderV4(
                seq_len, n_features, embedding_dim, hidden_dim, dropout, num_layers
            )
            self.decoder = DecoderV4(
                seq_len, embedding_dim, hidden_dim, n_features, num_layers
            )
    
        def forward(self, x, lens):
    
            x = self.encoder(x)
            x = self.decoder(x, lens)
    
            return x
    

    The full code and a paper using this Autoencoder can be found at GitHub and arXiv, respectively.
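
    For reference, below is a minimal usage sketch (not from the original post; the batch size, dimensions, random data, and loss computation are illustrative assumptions, and the classes above are assumed to be in scope). It shows how the model can be driven end to end: the encoder consumes a PackedSequence, and because the decoder packs its output with the same lengths, a reconstruction loss can be computed directly on the .data tensors, which already exclude the padded steps.

    import torch
    import torch.nn.functional as F
    from torch.nn.utils.rnn import pack_padded_sequence

    # Illustrative sizes -- replace with values that match your data.
    batch_size, seq_len, n_features = 4, 50, 1
    embedding_dim, hidden_dim, dropout, num_layers = 8, 32, 0.2, 2

    model = RecurrentAutoencoderV4(
        seq_len, n_features, embedding_dim, hidden_dim, dropout, num_layers
    )

    # Variable-length series, zero-padded up to seq_len.
    lens = torch.tensor([50, 42, 30, 17])
    padded = torch.randn(batch_size, seq_len, n_features)

    # The encoder expects a PackedSequence.
    packed_in = pack_padded_sequence(
        padded, lens, batch_first=True, enforce_sorted=False
    )
    packed_out = model(packed_in, lens)

    # Both sides were packed with the same lengths, so their .data tensors line
    # up element for element and padded steps are already excluded.
    loss = F.mse_loss(packed_out.data, packed_in.data)
    loss.backward()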