Why am I getting the error ValueError: Expected input batch_size (4) to match target batch_size (64)?
Is it something to do with an incorrect number of input features in the first linear layer? In this example I have 128 * 4 * 4 as the input size.
I have looked online and on this site for an answer but have not been able to find one, so I am asking here.
Here is the network:
    import torch.nn as nn

    class Net(nn.Module):
        """A representation of a convolutional neural network comprised of VGG blocks."""
        def __init__(self, n_channels):
            super(Net, self).__init__()
            # VGG block 1
            self.conv1 = nn.Conv2d(n_channels, 64, (3,3))
            self.act1 = nn.ReLU()
            self.pool1 = nn.MaxPool2d((2,2), stride=(2,2))
            # VGG block 2
            self.conv2 = nn.Conv2d(64, 64, (3,3))
            self.act2 = nn.ReLU()
            self.pool2 = nn.MaxPool2d((2,2), stride=(2,2))
            # VGG block 3
            self.conv3 = nn.Conv2d(64, 128, (3,3))
            self.act3 = nn.ReLU()
            self.pool3 = nn.MaxPool2d((2,2), stride=(2,2))
            # Fully connected layer
            self.f1 = nn.Linear(128 * 4 * 4, 1000)
            self.act4 = nn.ReLU()
            # Output layer
            self.f2 = nn.Linear(1000, 10)
            self.act5 = nn.Softmax(dim=1)

        def forward(self, X):
            """This function forward propagates the input."""
            # VGG block 1
            X = self.conv1(X)
            X = self.act1(X)
            X = self.pool1(X)
            # VGG block 2
            X = self.conv2(X)
            X = self.act2(X)
            X = self.pool2(X)
            # VGG block 3
            X = self.conv3(X)
            X = self.act3(X)
            X = self.pool3(X)
            # Flatten
            X = X.view(-1, 128 * 4 * 4)
            # Fully connected layer
            X = self.f1(X)
            X = self.act4(X)
            # Output layer
            X = self.f2(X)
            X = self.act5(X)
            return X
Here is the training loop:
    import datetime

    def training_loop(
            n_epochs,
            optimizer,
            model,
            loss_fn,
            train_loader):
        for epoch in range(1, n_epochs + 1):
            loss_train = 0.0
            for i, (imgs, labels) in enumerate(train_loader):
                outputs = model(imgs)
                loss = loss_fn(outputs, labels)
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
                loss_train += loss.item()
            # Report the mean batch loss on the first epoch and every tenth epoch
            if epoch == 1 or epoch % 10 == 0:
                print('{} Epoch {}, Training loss {}'.format(
                    datetime.datetime.now(),
                    epoch,
                    loss_train / len(train_loader)))
That's because you're getting the dimensions wrong. From the error and your comment, I take it that your input has the shape (64, 1, 28, 28).
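Now, the shape of X at X = self.pool3(X) is (64, 128, 1, 1): each unpadded 3x3 convolution trims 2 pixels from the height and width and each pooling layer halves them (rounding down), so 28 -> 26 -> 13 -> 11 -> 5 -> 3 -> 1. That tensor holds 64 * 128 = 8192 values, so the next line, X.view(-1, 128 * 4 * 4), reshapes it to (4, 128 * 4 * 4), because 8192 / 2048 = 4.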
Long story short, the output of your model is (4, 10), i.e. batch_size (4), which you're comparing on the line loss = loss_fn(outputs, labels) with a tensor of batch_size (64), as the error said.
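The shape arithmetic above can be checked directly. This is a minimal sketch that rebuilds only the convolution and pooling layers from the question and assumes a batch of 64 single-channel 28x28 images (the input shape is inferred from the error, not confirmed by the asker); the names features, shapes and flat_shape are only for illustration.

```python
import torch
import torch.nn as nn

# Same conv/pool stack as in the question, assuming n_channels = 1.
features = nn.Sequential(
    nn.Conv2d(1, 64, (3, 3)), nn.ReLU(), nn.MaxPool2d((2, 2), stride=(2, 2)),
    nn.Conv2d(64, 64, (3, 3)), nn.ReLU(), nn.MaxPool2d((2, 2), stride=(2, 2)),
    nn.Conv2d(64, 128, (3, 3)), nn.ReLU(), nn.MaxPool2d((2, 2), stride=(2, 2)),
)

x = torch.randn(64, 1, 28, 28)  # assumed input: 64 grayscale 28x28 images
shapes = []
for layer in features:
    x = layer(x)
    if isinstance(layer, nn.MaxPool2d):
        # Record the shape at the end of each VGG block
        shapes.append(tuple(x.shape))
        print(tuple(x.shape))

# The flatten line from the question: -1 absorbs the mismatch into the batch dim
flat = x.view(-1, 128 * 4 * 4)
flat_shape = tuple(flat.shape)
print(flat_shape)
```

Because -1 lets view pick whatever first dimension makes the element count fit, the wrong feature size does not fail at the flatten step; it shows up later as a batch size mismatch in the loss.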
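I don't know what you're trying to do, but I'm guessing that you'd want to change the line self.f1 = nn.Linear(128 * 4 * 4, 1000) to self.f1 = nn.Linear(128 * 1 * 1, 1000). The flatten line in forward has to change to match, i.e. X = X.view(-1, 128 * 4 * 4) becomes X = X.view(-1, 128 * 1 * 1), otherwise X is still reshaped to (4, 2048).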
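Here is a minimal sketch of that change, assuming the pooled feature map really is (64, 128, 1, 1). The class name FixedHead is hypothetical and covers only the layers after pool3. Flattening with X.view(X.size(0), -1) is a swapped-in alternative to the hard-coded view: it always keeps the batch dimension, so a wrong in_features value fails at the linear layer instead of silently changing the batch size.

```python
import torch
import torch.nn as nn

class FixedHead(nn.Module):
    """Hypothetical stand-in for the layers after pool3 in the question."""
    def __init__(self, in_features=128 * 1 * 1):
        super().__init__()
        # Fully connected layer sized to the actual pooled output
        self.f1 = nn.Linear(in_features, 1000)
        self.act4 = nn.ReLU()
        # Output layer, unchanged from the question
        self.f2 = nn.Linear(1000, 10)
        self.act5 = nn.Softmax(dim=1)

    def forward(self, X):
        # Flatten everything except the batch dimension
        X = X.view(X.size(0), -1)
        X = self.act4(self.f1(X))
        return self.act5(self.f2(X))

head = FixedHead()
pooled = torch.randn(64, 128, 1, 1)  # assumed shape after pool3
out = head(pooled)
out_shape = tuple(out.shape)
print(out_shape)
```

With this flatten, the output keeps the batch size of 64, so it lines up with the 64 labels passed to loss_fn.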