I am trying to build a multi-layer neural network. My training data has this shape:
train[0][0].shape
(4096,)
Below are my dense layers:
from collections import OrderedDict
n_out = 8
net = nn.Sequential(OrderedDict([
    ('hidden_linear', nn.Linear(4096, 1366)),
    ('hidden_activation', nn.Tanh()),
    ('hidden_linear', nn.Linear(1366, 456)),
    ('hidden_activation', nn.Tanh()),
    ('hidden_linear', nn.Linear(456, 100)),
    ('hidden_activation', nn.Tanh()),
    ('output_linear', nn.Linear(100, n_out))
]))
I am using cross-entropy as the loss function. The problem appears when I train the model with the code below:
learning_rate = 0.001
optimizer = torch.optim.SGD(net.parameters(), lr=learning_rate)
n_epochs = 40
for epoch in range(n_epochs):
    for snds, labels in final_train_loader:
        outputs = net(snds.view(snds.shape[0], -1))
        loss = loss_fn(outputs, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print("Epoch: %d, Loss: %f" % (epoch, float(loss)))
I get the following matrix multiplication error:
RuntimeError: mat1 and mat2 shapes cannot be multiplied (100x4096 and 456x100)
I have the dimensions wrong but cannot figure out how to get it right.
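The OrderedDict contains three Linear layers associated with the same key, hidden_linear (the same happens with the nn.Tanh layers under hidden_activation). A dictionary keeps only one value per key, so each later entry overwrites the earlier one and the network ends up with just three modules: Linear(456, 100), Tanh, and Linear(100, n_out). That is why the 100x4096 input is multiplied against a 456x100 weight. To make it work, give each layer a distinct name: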
from collections import OrderedDict

import torch
from torch import nn

n_out = 8
inp = torch.rand(100, 4096)  # dummy batch: 100 samples, 4096 features each
net = nn.Sequential(OrderedDict([
    ('hidden_linear0', nn.Linear(4096, 1366)),
    ('hidden_activation0', nn.Tanh()),
    ('hidden_linear1', nn.Linear(1366, 456)),
    ('hidden_activation1', nn.Tanh()),
    ('hidden_linear2', nn.Linear(456, 100)),
    ('hidden_activation2', nn.Tanh()),
    ('output_linear', nn.Linear(100, n_out))
]))
net(inp) # now it works!
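To see the key collision directly, here is a minimal sketch that rebuilds the broken model with small, made-up layer sizes (16, 8, 4, 2 are assumptions chosen only so it runs quickly) and lists which modules actually survive. It also shows one possible alternative: passing the modules positionally, in which case nn.Sequential numbers them itself and no names can clash.

```python
from collections import OrderedDict

import torch
from torch import nn

# Illustrative sizes only; they are not the sizes from the question.
broken = nn.Sequential(OrderedDict([
    ('hidden_linear', nn.Linear(16, 8)),
    ('hidden_activation', nn.Tanh()),
    ('hidden_linear', nn.Linear(8, 4)),    # replaces the first 'hidden_linear'
    ('hidden_activation', nn.Tanh()),      # replaces the first 'hidden_activation'
    ('output_linear', nn.Linear(4, 2)),
]))

# Only three modules remain, and the first one now expects 8 inputs, not 16.
layer_names = [name for name, _ in broken.named_children()]
print(layer_names)  # ['hidden_linear', 'hidden_activation', 'output_linear']
print(broken.hidden_linear.in_features)  # 8

# Alternative: unnamed modules are indexed automatically ('0', '1', ...).
fixed = nn.Sequential(
    nn.Linear(16, 8), nn.Tanh(),
    nn.Linear(8, 4), nn.Tanh(),
    nn.Linear(4, 2),
)
out = fixed(torch.rand(5, 16))
print(out.shape)  # torch.Size([5, 2])
```

Printing the model (print(net)) or iterating over named_children() like this is a quick way to confirm that every layer you declared is actually present before training.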