
Cannot figure out dense layers dimensions to run the neural network

I am trying to build a multi-layer neural network. I have training data with shape:


Below is my dense layer

from collections import OrderedDict
n_out = 8
net = nn.Sequential(OrderedDict([
                            ('hidden_linear', nn.Linear(4096, 1366)),
                            ('hidden_activation', nn.Tanh()),
                            ('hidden_linear', nn.Linear(1366, 456)),
                            ('hidden_activation', nn.Tanh()),
                            ('hidden_linear', nn.Linear(456, 100)),
                            ('hidden_activation', nn.Tanh()), 
                            ('output_linear', nn.Linear(100, n_out))
]))

I am using cross-entropy as the loss function. The problem appears when I train the model with the code below:

learning_rate = 0.001
optimizer = torch.optim.SGD(net.parameters(), lr=learning_rate)
n_epochs = 40

for epoch in range(n_epochs):
    for snds, labels in final_train_loader:
        outputs = net(snds.view(snds.shape[0], -1))
        loss = loss_fn(outputs, labels)

    print("Epoch: %d, Loss: %f" % (epoch, float(loss)))

The error I receive is a matrix multiplication error:

 RuntimeError: mat1 and mat2 shapes cannot be multiplied (100x4096 and 456x100)

I have the dimensions wrong but cannot figure out how to get it right.


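  • The OrderedDict contains three Linear layers registered under the same key, hidden_linear (the same happens with the three nn.Tanh layers under hidden_activation). A dictionary keeps only one value per key, so each repeated entry overwrites the previous one and the network ends up as Linear(456, 100), Tanh, Linear(100, n_out). The first surviving layer expects 456 input features but receives 4096, which is exactly what the error message reports. To make it work, give each layer a unique name: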

    from collections import OrderedDict

    import torch
    from torch import nn

    n_out = 8
    inp = torch.rand(100, 4096)  # dummy batch: 100 samples, 4096 features each
    net = nn.Sequential(OrderedDict([
                                ('hidden_linear0', nn.Linear(4096, 1366)),
                                ('hidden_activation0', nn.Tanh()),
                                ('hidden_linear1', nn.Linear(1366, 456)),
                                ('hidden_activation1', nn.Tanh()),
                                ('hidden_linear2', nn.Linear(456, 100)),
                                ('hidden_activation2', nn.Tanh()),
                                ('output_linear', nn.Linear(100, n_out))
    ]))
    net(inp)  # now it works! The output has shape (100, n_out)
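
    To see the key collision on its own, here is a minimal sketch that needs no PyTorch. The (in_features, out_features) tuples and the None placeholders are stand-ins I am assuming for illustration in place of the real nn.Linear and nn.Tanh modules, and find_duplicate_names is a hypothetical helper, not part of any library:

    ```python
    from collections import OrderedDict

    # Stand-in values: (in_features, out_features) tuples replace nn.Linear,
    # and None replaces nn.Tanh, so the sketch runs without PyTorch.
    named_layers = [
        ("hidden_linear", (4096, 1366)),
        ("hidden_activation", None),
        ("hidden_linear", (1366, 456)),
        ("hidden_activation", None),
        ("hidden_linear", (456, 100)),
        ("hidden_activation", None),
        ("output_linear", (100, 8)),
    ]

    # A repeated key overwrites the earlier value but keeps its first position,
    # so only three of the seven entries survive.
    layers = OrderedDict(named_layers)
    print(list(layers.items()))


    def find_duplicate_names(names):
        """Return the names that occur more than once, in first-seen order."""
        seen, dupes = set(), []
        for name in names:
            if name in seen and name not in dupes:
                dupes.append(name)
            seen.add(name)
        return dupes


    # Hypothetical sanity check to run before building nn.Sequential.
    print(find_duplicate_names([name for name, _ in named_layers]))
    ```

    Running a check like this on the list of names before wrapping it in OrderedDict catches the problem early, since the dictionary itself drops the duplicates silently.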