I am new to deep learning and CNNs, and I am trying to get familiar with the field using the CIFAR10 tutorial code from the PyTorch website. In that code I was playing with removing/adding layers to better understand their effect, and I tried to connect the input (the initial data with a batch of 4 images) directly to the output using only a single fully connected layer. I know that does not make much sense, but I did it only for the sake of experiment. When I ran it, I got some errors, which are as follows:
First, here is the code snippet:
########################################################################
# 2. Define a Convolution Neural Network
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# Copy the neural network from the Neural Networks section before and modify it to
# take 3-channel images (instead of 1-channel images as it was defined).
import torch.nn as nn
import torch.nn.functional as F
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        #self.conv1 = nn.Conv2d(3, 6, 5)
        #self.pool = nn.MaxPool2d(2, 2)
        #self.conv2 = nn.Conv2d(6, 16, 5)
        #self.fc1 = nn.Linear(16 * 5 * 5, 120)
        #self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(768 * 4 * 4, 10)

    def forward(self, x):
        #x = self.pool(F.relu(self.conv1(x)))
        #x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 768 * 4 * 4)
        #x = F.relu(self.fc1(x))
        #x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
#######################################################################
# 3. Define a Loss function and optimizer
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# Let's use a Classification Cross-Entropy loss and SGD with momentum.
import torch.optim as optim
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
########################################################################
# 4. Train the network
# ^^^^^^^^^^^^^^^^^^^^
#
# This is when things start to get interesting.
# We simply have to loop over our data iterator, and feed the inputs to the
# network and optimize.
for epoch in range(4):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        print(len(outputs))
        print(len(labels))
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:  # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')
So, when I run the code, I get the following error:
Traceback (most recent call last):
File "C:\Users\Andrey\Desktop\Machine_learning_Danila\Homework 3\cifar10_tutorial1.py", line 180, in <module>
loss = criterion(outputs, labels)
File "C:\Program Files\Python36\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "C:\Program Files\Python36\lib\site-packages\torch\nn\modules\loss.py", line 862, in forward
ignore_index=self.ignore_index, reduction=self.reduction)
File "C:\Program Files\Python36\lib\site-packages\torch\nn\functional.py", line 1550, in cross_entropy
return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
File "C:\Program Files\Python36\lib\site-packages\torch\nn\functional.py", line 1405, in nll_loss
.format(input.size(0), target.size(0)))
ValueError: Expected input batch_size (1) to match target batch_size (4).
I tried checking the length of x: it is 4 initially, but it becomes 1 after the line
x = x.view(-1, 768 * 4 * 4)
I think my numbers are correct, but it seems I end up with only 1 tensor instead of the 4 I am supposed to have, and I feel that is what causes the error. Why is that, and what is the best way to fix it? Also, what would be the optimal number for the output dimension of nn.Linear (the fully connected layer) in this case?
There are two obvious errors in your modified code (relative to the official one from the PyTorch website). First, the correct signature is
torch.nn.Linear(in_features, out_features)
but you're passing 768 * 4 * 4 (= 12288) as in_features. That is 4 times the actual number of neurons (pixels) in one CIFAR10 image (32 * 32 * 3 = 3072).
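To see the arithmetic behind the error concretely (plain Python, no PyTorch needed):

```python
# Why view(-1, 768*4*4) collapses the batch of 4 into a batch of 1.
batch = 4
per_image = 32 * 32 * 3            # 3072 features in one CIFAR10 image
total = batch * per_image          # 12288 elements in the whole batch
wrong_in_features = 768 * 4 * 4    # also 12288 -- the entire batch fits in ONE row

print(total == wrong_in_features)          # True
print(total // wrong_in_features)          # 1 -> x.view(-1, 768*4*4) gives batch size 1
print(total // per_image)                  # 4 -> x.view(-1, 32*32*3) gives batch size 4
```

Since view infers the -1 dimension from the total element count, choosing 12288 columns leaves room for exactly one row, which is the batch size of 1 in the traceback.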
The second bug is related to the first one. When you prepare your inputs tensor,

# forward + backward + optimize
# `inputs` should be a tensor of shape [batch_size, input_size]
outputs = net(inputs)

you should pass it as a tensor of shape [batch_size, input_size], which in your case is [4, 3072] since you want a batch size of 4. This is where you should provide the batch dimension, not in nn.Linear, which is what you're currently doing and what is causing the error.
Finally, you should also fix the line in the forward method. Change

x = x.view(-1, 768 * 4 * 4)

to

x = x.view(-1, 32 * 32 * 3)

Fixing these bugs should fix your errors.
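Put together, a minimal corrected version of the module could look like the sketch below (the random tensor is a stand-in for one batch from your trainloader):

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # in_features = pixels in ONE image (32*32*3), not in the whole batch
        self.fc3 = nn.Linear(32 * 32 * 3, 10)

    def forward(self, x):
        # [4, 3, 32, 32] -> [4, 3072]: flatten each image, keep the batch dimension
        x = x.view(-1, 32 * 32 * 3)
        return self.fc3(x)

net = Net()
out = net(torch.randn(4, 3, 32, 32))  # a fake CIFAR10 batch of 4
print(out.shape)  # torch.Size([4, 10])
```

Now the output batch size (4) matches the target batch size, so CrossEntropyLoss no longer raises the ValueError.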
Having said that, I'm unsure whether this would actually work well in a conceptual sense, because it is a single linear transformation (i.e., an affine transformation without any non-linearity). The data points (the images in CIFAR10) are most probably not linearly separable in this 3072-dimensional space (manifold), so the accuracy would be drastically poor. It is therefore advisable to add at least one hidden layer with a non-linearity such as ReLU.
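For example, a single hidden layer with ReLU could be sketched as follows (the hidden width 120 is just the tutorial's value, not a tuned choice):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(32 * 32 * 3, 120)  # hidden layer; width 120 is arbitrary
        self.fc3 = nn.Linear(120, 10)           # 10 CIFAR10 classes

    def forward(self, x):
        x = x.view(-1, 32 * 32 * 3)  # flatten each image, keep the batch dimension
        x = F.relu(self.fc1(x))      # non-linearity between the two affine maps
        return self.fc3(x)

net = Net()
out = net(torch.randn(4, 3, 32, 32))  # a fake batch of 4 CIFAR10 images
print(out.shape)  # torch.Size([4, 10])
```

Without the ReLU, the two linear layers would collapse into a single affine map and gain nothing over your one-layer version.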