deep-learning pytorch conv-neural-network bioinformatics

How to add additional layers to CNN model in PyTorch?

I am a beginner at specifying model architectures. I found this example of a DNA sequence model built in PyTorch, and I want to improve it. The example uses a basic CNN with a single convolutional layer; I want to build a deeper model with more layers.

from torch import nn

# basic CNN model
# These aren't optimized models, just something to start with, just testing pytorch with context of DNA
class DNA_CNN(nn.Module):
    def __init__(self,
                 seq_len,
                 num_filters=32,
                 kernel_size=3):
        super().__init__()
        self.seq_len = seq_len
        
        self.conv_net = nn.Sequential(
            # 4 is for the 4 nucleotides
            nn.Conv1d(4, num_filters, kernel_size=kernel_size),
            nn.ReLU(inplace=True),
            nn.Flatten(),
            nn.Linear(num_filters*(seq_len-kernel_size+1), 1)
        ) 

    def forward(self, xb):
        # reshape view to batch_size x 4channel x seq_len
        # permute to put channel in correct order
        xb = xb.permute(0,2,1) 
        
        #print(xb.shape)
        out = self.conv_net(xb)
        return out

Solution

Here is a modular version that builds the convolutional stack from a list of filter counts. It uses padding='same', which zero-pads the borders before each convolution so that the sequence length is preserved:

from typing import List

import torch
from torch import nn

class DNA_CNN(nn.Module):
    def __init__(self,
                 seq_len: int,
                 num_filters: List[int] = [32, 64],
                 kernel_size: int = 3):
        super().__init__()
        self.seq_len = seq_len
        # CNN module
        self.conv_net = nn.Sequential()
        num_filters = [4] + num_filters
        for idx in range(len(num_filters) - 1):
            self.conv_net.add_module(
                f"conv_{idx}",
                nn.Conv1d(num_filters[idx], num_filters[idx + 1],
                          kernel_size=kernel_size, padding='same')
            )
            self.conv_net.add_module(f"relu_{idx}", nn.ReLU(inplace=True))
        self.conv_net.add_module("flatten", nn.Flatten())
        self.conv_net.add_module(
            "linear",
            nn.Linear(num_filters[-1]*seq_len, 1)
        )
        
    def forward(self, xb: torch.Tensor):
        """Forward pass."""
        xb = xb.permute(0, 2, 1) 
        out = self.conv_net(xb)
        return out

To use a different kernel size in each layer, pass a list as kernel_size (one entry per convolution) and use kernel_size=kernel_size[idx] in the convolution.
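A minimal sketch of that per-layer variant, assuming one kernel size per convolution. The class name `DNA_CNN_MultiKernel` and the parameter name `kernel_sizes` are hypothetical and not part of the original example. String padding such as `padding='same'` needs a reasonably recent PyTorch release, and odd kernel sizes keep the padding symmetric.

```python
from typing import Sequence

import torch
from torch import nn


class DNA_CNN_MultiKernel(nn.Module):
    """Hypothetical variant of DNA_CNN with one kernel size per conv layer."""

    def __init__(self,
                 seq_len: int,
                 num_filters: Sequence[int] = (32, 64),
                 kernel_sizes: Sequence[int] = (3, 5)):
        super().__init__()
        if len(num_filters) != len(kernel_sizes):
            raise ValueError("need exactly one kernel size per conv layer")
        # Prepend the 4 input channels (one per nucleotide).
        channels = [4] + list(num_filters)
        self.conv_net = nn.Sequential()
        for idx, k in enumerate(kernel_sizes):
            self.conv_net.add_module(
                f"conv_{idx}",
                nn.Conv1d(channels[idx], channels[idx + 1],
                          kernel_size=k, padding='same')
            )
            self.conv_net.add_module(f"relu_{idx}", nn.ReLU(inplace=True))
        self.conv_net.add_module("flatten", nn.Flatten())
        # 'same' padding keeps the length at seq_len for every layer.
        self.conv_net.add_module("linear", nn.Linear(channels[-1] * seq_len, 1))

    def forward(self, xb: torch.Tensor) -> torch.Tensor:
        # batch x seq_len x 4  ->  batch x 4 x seq_len
        xb = xb.permute(0, 2, 1)
        return self.conv_net(xb)


model = DNA_CNN_MultiKernel(seq_len=50)
xb = torch.zeros(8, 50, 4)  # dummy batch of 8 one-hot-shaped sequences
out = model(xb)
print(out.shape)  # torch.Size([8, 1])
```

Because the padding preserves the sequence length, the Linear input size stays `channels[-1] * seq_len` whatever kernel sizes are chosen.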

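If for some reason you want to remove the padding, drop padding='same' from the convolution and change the Linear definition to match the new shape. Each unpadded convolution (stride 1) shortens the sequence by kernel_size - 1, and num_filters here is the list after 4 has been prepended, so len(num_filters) - 1 is the number of convolutions: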

nn.Linear(num_filters[-1] * (seq_len - (len(num_filters) - 1) * (kernel_size - 1)), 1)
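The arithmetic behind that Linear size can be checked without building a model. The sketch below assumes stride 1, no dilation and a single shared kernel size. The helper name `unpadded_flat_features` is hypothetical.

```python
def unpadded_flat_features(seq_len: int, num_filters: list, kernel_size: int) -> int:
    """Input size of the final Linear layer when the convs use no padding.

    `num_filters` is the list *after* prepending the 4 input channels,
    so it has one more entry than there are conv layers.
    """
    n_convs = len(num_filters) - 1
    return num_filters[-1] * (seq_len - n_convs * (kernel_size - 1))


# Cross-check with a layer-by-layer walk: each conv removes kernel_size - 1.
seq_len, kernel_size = 50, 3
num_filters = [4, 32, 64]
length = seq_len
for _ in range(len(num_filters) - 1):
    length = length - kernel_size + 1

features = unpadded_flat_features(seq_len, num_filters, kernel_size)
print(length, features)  # 46 2944
```

With two convolutions of kernel size 3, a length-50 sequence shrinks to 46 positions, so the Linear layer sees 64 * 46 = 2944 inputs.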