As the title says, if I set the number of hidden nodes in my PyTorch neural network to anything different from the number of input nodes, it returns the error below.
RuntimeError: mat1 and mat2 shapes cannot be multiplied (380x10 and 2x10)
I think the architecture is incorrectly coded, but I'm relatively new to PyTorch and neural networks, so I can't spot the mistake. Any help is greatly appreciated; I've included the code below.
class FCN(nn.Module):
    def __init__(self, N_INPUT, N_OUTPUT, N_HIDDEN, N_LAYERS):
        super().__init__()
        activation = nn.Tanh
        self.fcs = nn.Sequential(*[
            nn.Linear(N_INPUT, N_HIDDEN),
            activation()])
        self.fch = nn.Sequential(*[
            nn.Sequential(*[
                nn.Linear(N_INPUT, N_HIDDEN),
                activation()]) for _ in range(N_LAYERS-1)])
        self.fce = nn.Linear(N_INPUT, N_HIDDEN)

    def forward(self, x):
        x = self.fcs(x)
        x = self.fch(x)
        x = self.fce(x)
        return x

torch.manual_seed(123)
pinn = FCN(2, 2, 10, 8)
If the architecture is instead defined as pinn = FCN(2, 2, 2, 8), no error is raised, but the network does not perform well.
Please let me know if you need any more information, and thank you!
The error you're getting occurs because the output of your first layer (fcs) has dimension N_HIDDEN (which is 10), while the hidden layers in fch have input dimension N_INPUT (which is 2). To fix this, ensure that the input size of each layer matches the output size of the previous layer. In your code:
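As a minimal sketch of why that multiplication fails: the (380, 10) activations coming out of fcs are fed into a Linear(2, 10) layer inside fch, which expects 2-dimensional inputs. (The batch size of 380 here is just taken from the error message.)

```python
import torch
import torch.nn as nn

x = torch.randn(380, 10)      # output of fcs: batch of 380, N_HIDDEN = 10 features
bad_layer = nn.Linear(2, 10)  # first layer inside fch: expects N_INPUT = 2 features

try:
    bad_layer(x)
except RuntimeError as e:
    print(e)  # mat1 and mat2 shapes cannot be multiplied (380x10 and 2x10)
```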
class FCN(nn.Module):
    def __init__(self, N_INPUT, N_OUTPUT, N_HIDDEN, N_LAYERS):
        super().__init__()
        activation = nn.Tanh
        self.fcs = nn.Sequential(
            nn.Linear(N_INPUT, N_HIDDEN),
            activation()
        )
        self.fch = nn.Sequential(*[
            nn.Sequential(
                nn.Linear(N_HIDDEN, N_HIDDEN),  # adjust input size to N_HIDDEN
                activation()
            ) for _ in range(N_LAYERS - 1)
        ])
        self.fce = nn.Linear(N_HIDDEN, N_OUTPUT)  # output layer maps to N_OUTPUT

    def forward(self, x):
        x = self.fcs(x)
        x = self.fch(x)
        x = self.fce(x)
        return x
Finally, to get good performance you should experiment with the hidden size (not just 2 or 10; also try 100 or 1000), the number of layers (start with 1 or 2, not 8), and the learning rate of the optimizer.
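As a rough sketch of that tuning advice, a training setup with the corrected class might look like the following. Note the Adam optimizer, the learning rate, the loss, and the random toy data are my own assumptions for illustration, not from the question:

```python
import torch
import torch.nn as nn

class FCN(nn.Module):
    """Fully connected network with the corrected layer sizes."""
    def __init__(self, N_INPUT, N_OUTPUT, N_HIDDEN, N_LAYERS):
        super().__init__()
        activation = nn.Tanh
        self.fcs = nn.Sequential(nn.Linear(N_INPUT, N_HIDDEN), activation())
        self.fch = nn.Sequential(*[
            nn.Sequential(nn.Linear(N_HIDDEN, N_HIDDEN), activation())
            for _ in range(N_LAYERS - 1)
        ])
        self.fce = nn.Linear(N_HIDDEN, N_OUTPUT)

    def forward(self, x):
        return self.fce(self.fch(self.fcs(x)))

torch.manual_seed(123)
model = FCN(2, 2, 100, 2)  # wider but shallower, per the advice above
opt = torch.optim.Adam(model.parameters(), lr=1e-3)  # assumed optimizer / lr
loss_fn = nn.MSELoss()

x = torch.randn(380, 2)  # toy data; replace with your own inputs/targets
y = torch.randn(380, 2)

for step in range(100):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()

print(model(x).shape)  # torch.Size([380, 2]) -- the shapes now line up
```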