How to convert a CNN LSTM from Keras to PyTorch

I am trying to convert a CNN-LSTM from Keras to PyTorch, but I am having trouble.

# Imports assumed from the names used below (tensorflow.keras; standalone keras exposes the same names)
from tensorflow.keras import layers, models
from tensorflow.keras.layers import LSTM, Dropout, TimeDistributed

ConvNN_model = models.Sequential()
ConvNN_model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 1)))
ConvNN_model.add(layers.MaxPooling2D((2, 2)))
ConvNN_model.add(layers.Conv2D(64, (3, 3), activation='relu'))
ConvNN_model.add(TimeDistributed(LSTM(128, activation='relu')))
ConvNN_model.add(Dropout(0.2))
ConvNN_model.add(LSTM(128, activation='relu'))
ConvNN_model.add(layers.Dense(64, activation='relu'))
ConvNN_model.add(layers.Dropout(0.25))
ConvNN_model.add(layers.Dense(15, activation='softmax'))

How can I convert the above code from Keras to PyTorch?

Solution

This is your CNN in Keras:

ConvNN_model = models.Sequential()
ConvNN_model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 1)))
ConvNN_model.add(layers.MaxPooling2D((2, 2)))
ConvNN_model.add(layers.Conv2D(64, (3, 3), activation='relu'))
ConvNN_model.add(TimeDistributed(LSTM(128, activation='relu')))
ConvNN_model.add(Dropout(0.2))
ConvNN_model.add(LSTM(128, activation='relu'))
ConvNN_model.add(layers.Dense(64, activation='relu'))
ConvNN_model.add(layers.Dropout(0.25))
ConvNN_model.add(layers.Dense(15, activation='softmax'))

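This is a close equivalent in PyTorch (nn.LSTM does not let you change the cell activation, so the LSTMs here use tanh instead of relu):

import torch.nn as nn

class ConvNN_model(nn.Module):
    def __init__(self):
        super().__init__()
        # PyTorch is channels-first: the Keras input (64, 64, 1) becomes (batch, 1, 64, 64)
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3),   # -> (batch, 32, 62, 62)
            nn.ReLU(),
            nn.MaxPool2d((2, 2)),              # -> (batch, 32, 31, 31)
            nn.Conv2d(32, 64, kernel_size=3),  # -> (batch, 64, 29, 29)
            nn.ReLU(),
        )
        # nn.LSTM returns (output, (h_n, c_n)), so it cannot be chained inside nn.Sequential
        self.row_lstm = nn.LSTM(64, 128, batch_first=True)
        self.dropout = nn.Dropout(0.2)
        self.lstm = nn.LSTM(128, 128, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Dropout(0.25),
            nn.Linear(64, 15),
            nn.Softmax(dim=1),  # remove if you train with nn.CrossEntropyLoss, which expects logits
        )

    def forward(self, x):
        x = self.cnn(x).permute(0, 2, 3, 1)  # (batch, rows, cols, channels), as in Keras
        b, rows, cols, ch = x.shape
        # TimeDistributed(LSTM): run one LSTM over every row and keep its last hidden state
        _, (h_n, _) = self.row_lstm(x.reshape(b * rows, cols, ch))
        x = self.dropout(h_n[-1].reshape(b, rows, -1))  # (batch, rows, 128)
        _, (h_n, _) = self.lstm(x)  # the second LSTM treats the rows as timesteps
        return self.head(h_n[-1])  # (batch, 15)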
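
A quick way to check a conversion like this is to trace tensor shapes through the convolutional stack and see what the recurrent layers actually receive. The sketch below assumes a batch of 4 random 64x64 grayscale images; the batch size and the variable names are illustrative and not part of the original model. It also shows two things that differ from Keras: PyTorch expects channels-first input, and nn.LSTM returns a tuple rather than a single tensor.

```python
import torch
import torch.nn as nn

# Convolutional part only, with the same layer sizes as the Keras model
cnn = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3),
    nn.ReLU(),
)

x = torch.randn(4, 1, 64, 64)        # (batch, channels, height, width)
feats = cnn(x)                       # (4, 64, 29, 29)

# Move channels last so each image row is a sequence of 29 steps with 64 features
seq = feats.permute(0, 2, 3, 1)      # (4, 29, 29, 64)
rows = seq.reshape(4 * 29, 29, 64)   # fold the row axis into the batch axis

lstm = nn.LSTM(input_size=64, hidden_size=128, batch_first=True)
output, (h_n, c_n) = lstm(rows)      # nn.LSTM returns a tuple, not a tensor

# Last hidden state per row, unfolded back to (batch, rows, hidden)
per_row = h_n[-1].reshape(4, 29, 128)
print(feats.shape, output.shape, per_row.shape)
```

The (batch, rows, hidden) tensor at the end has the same shape as the output of TimeDistributed(LSTM(128)) in the Keras model, so it can be passed to a second nn.LSTM with input_size=128.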

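Keep in mind that PyTorch has no built-in equivalent of Keras's TimeDistributed wrapper, so you have to build it yourself. Here is one that you can use (from here). It flattens the sample and timestep axes into a single axis, so it suits modules such as nn.Linear that map a 2D tensor to a 2D tensor; it is not suitable for wrapping nn.LSTM directly, because nn.LSTM needs the sequence axis and returns a tuple: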

class TimeDistributed(nn.Module):
    def __init__(self, module, batch_first=False):
        super(TimeDistributed, self).__init__()
        self.module = module
        self.batch_first = batch_first

    def forward(self, x):

        if len(x.size()) <= 2:
            return self.module(x)

        # Squash samples and timesteps into a single axis
        x_reshape = x.contiguous().view(-1, x.size(-1))  # (samples * timesteps, input_size)

        y = self.module(x_reshape)

        # We have to reshape Y
        if self.batch_first:
            y = y.contiguous().view(x.size(0), -1, y.size(-1))  # (samples, timesteps, output_size)
        else:
            y = y.view(-1, x.size(1), y.size(-1))  # (timesteps, samples, output_size)

        return y
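
To make the reshaping in this wrapper concrete, here is a minimal sketch of the same flatten, apply, unflatten steps written inline for an nn.Linear layer. The tensor sizes (5 samples, 7 timesteps, 8 features, 3 outputs) are arbitrary values chosen for illustration. It also compares the result with calling the layer on the 3D tensor directly, since nn.Linear already operates on the last dimension of inputs with any number of leading dimensions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
linear = nn.Linear(8, 3)
x = torch.randn(5, 7, 8)  # (samples, timesteps, features)

# What the wrapper does: squash samples and timesteps into one axis
flat = x.contiguous().view(-1, x.size(-1))       # (35, 8)
y_flat = linear(flat)                            # (35, 3)
y_wrapped = y_flat.view(x.size(0), -1, y_flat.size(-1))  # (5, 7, 3)

# nn.Linear applied directly to the 3D tensor
y_direct = linear(x)                             # (5, 7, 3)

same = torch.allclose(y_wrapped, y_direct, atol=1e-5)
print(y_wrapped.shape, same)
```

So for a Dense-style layer the wrapper is optional in PyTorch; it becomes useful for modules that only accept 2D input.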

There are a million-and-one ways to skin a cat; you do not have to build the entire network inside a single nn.Sequential block. Or, if you want to stick to the sequential style to stay consistent with Keras, you can skip subclassing nn.Module and use an nn.Sequential container on its own, as long as every layer in it takes one tensor and returns one tensor.
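
As a rough sketch of the sequential style: the convolutional layers can live in a bare nn.Sequential with no subclass at all. An LSTM needs a little help because it returns a tuple, so the example uses a small adapter called LastHidden. That class is a hypothetical helper written for this example, not part of PyTorch, and the input sizes are illustrative.

```python
import torch
import torch.nn as nn

# Layers that take one tensor and return one tensor chain without any subclass
cnn = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3),
    nn.ReLU(),
)

class LastHidden(nn.Module):
    """Hypothetical adapter (not a PyTorch class): returns only the final hidden state."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)

    def forward(self, x):
        _, (h_n, _) = self.lstm(x)   # discard the per-step outputs and the cell state
        return h_n[-1]               # (batch, hidden_size)

# With the adapter, the recurrent part and the classifier fit in nn.Sequential too
head = nn.Sequential(
    LastHidden(128, 128),
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Dropout(0.25),
    nn.Linear(64, 15),
)

features = cnn(torch.randn(2, 1, 64, 64))   # (2, 64, 29, 29)
scores = head(torch.randn(2, 29, 128))      # (2, 15)
print(features.shape, scores.shape)
```

The trade-off is that any reshaping between the two parts (such as folding image rows into the batch axis) still has to happen somewhere, either in another small adapter module or in a custom forward method.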