Tags: python, pytorch, neural-network

Changing the number of hidden nodes in my NN results in an error


As the title says, if I change the number of hidden nodes in my PyTorch neural network to anything different from the number of input nodes, it raises the error below.

RuntimeError: mat1 and mat2 shapes cannot be multiplied (380x10 and 2x10)

I think the architecture is coded incorrectly, but I am relatively new to PyTorch and neural networks, so I can't spot the mistake. Any help is greatly appreciated; I've included the code below.

import torch
import torch.nn as nn

class FCN(nn.Module):

    def __init__(self, N_INPUT, N_OUTPUT, N_HIDDEN, N_LAYERS):
        super().__init__()
        activation = nn.Tanh
        self.fcs = nn.Sequential(*[
            nn.Linear(N_INPUT, N_HIDDEN),
            activation()])
        self.fch = nn.Sequential(*[
            nn.Sequential(*[
                nn.Linear(N_INPUT, N_HIDDEN),
                activation()]) for _ in range(N_LAYERS-1)])
        self.fce = nn.Linear(N_INPUT, N_HIDDEN)

    def forward(self, x):
        x = self.fcs(x)
        x = self.fch(x)
        x = self.fce(x)
        return x


torch.manual_seed(123)

pinn = FCN(2, 2, 10, 8)

If the architecture is instead defined as pinn = FCN(2, 2, 2, 8), no error is raised, but the neural network does not perform well.

Other information:

  • the input is a tensor with a batch size of 380 (shape (380, 2), given the 2 input features)

Please let me know if you need any more information, and thank you!


Solution

  • The error occurs because the output of your first block (fcs) has dimension N_HIDDEN (which is 10), while the hidden layers in fch expect input of dimension N_INPUT (which is 2). In the error message, mat1 (380x10) is the batch of activations coming out of fcs, and mat2 (2x10) is the transposed weight of the first nn.Linear(N_INPUT, N_HIDDEN) layer in fch.

    To fix this, ensure that the input size of each layer matches the output size of the previous layer; the output layer fce should also map from N_HIDDEN to N_OUTPUT so the network returns N_OUTPUT values. Here is the corrected version:

    class FCN(nn.Module):
        def __init__(self, N_INPUT, N_OUTPUT, N_HIDDEN, N_LAYERS):
            super().__init__()
            activation = nn.Tanh
            self.fcs = nn.Sequential(
                nn.Linear(N_INPUT, N_HIDDEN),
                activation()
            )
            self.fch = nn.Sequential(*[
                nn.Sequential(
                    nn.Linear(N_HIDDEN, N_HIDDEN),  # Adjust input size to N_HIDDEN
                    activation()
                ) for _ in range(N_LAYERS - 1)
            ])
            self.fce = nn.Linear(N_HIDDEN, N_OUTPUT)  # Output layer
    
        def forward(self, x):
            x = self.fcs(x)
            x = self.fch(x)
            x = self.fce(x)
            return x
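
    As a quick sanity check (a minimal sketch using the batch shape mentioned in the question; the random input is just a stand-in for real data), the corrected model now accepts a (380, 2) input without errors:

    import torch

    torch.manual_seed(123)
    pinn = FCN(2, 2, 10, 8)   # same sizes that previously crashed
    x = torch.randn(380, 2)   # dummy batch: 380 samples, 2 features each
    print(pinn(x).shape)      # torch.Size([380, 2])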
    

    Finally, to get good performance, experiment with the hidden size (not just 2 or 10; also try 100 or 1000), the number of layers (start with 1 or 2, not 8), and the learning rate of the optimizer.
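
    For example, here is a minimal training-loop sketch for experimenting with those settings. The layer sizes, the Adam optimizer, the learning rate, the MSE loss, and the random data are all illustrative assumptions; substitute your real dataset and, since the variable name suggests a PINN, your physics-residual loss:

    import torch

    # Illustrative sizes: wider (64 hidden units) but shallower (3 layers)
    model = FCN(2, 2, 64, 3)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # 1e-3 is a common starting point
    loss_fn = torch.nn.MSELoss()  # assumed loss for plain regression

    X = torch.randn(380, 2)  # dummy inputs matching the question's batch size
    y = torch.randn(380, 2)  # dummy targets

    for epoch in range(1000):
        optimizer.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        optimizer.step()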