
How to build a CNN in PyTorch for RGB images?


I am building a CNN in PyTorch. Below is the code I would use for grayscale input images:

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
  def __init__(self):
    super(Net, self).__init__()
    # Shapes are (batch, channels, height, width)
    # 1x1x28x28 to 1x32x28x28
    self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, stride=1, padding=1)
    # 1x32x28x28 to 1x64x28x28
    self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, stride=1, padding=1)
    # 1x64x28x28 to 1x64x14x14
    self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
    # 1x64x14x14 to 1x128x14x14
    self.conv3 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=1, padding=1)
    # 1x128x14x14 to 1x128x7x7
    self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
    # 1x128x7x7 (flattened to 1x6272) to 1x128
    self.fc1 = nn.Linear(in_features=128*7*7, out_features=128)
    # 1x128 to 1x27 (no. of classes)
    self.fc2 = nn.Linear(in_features=128, out_features=27)

  def forward(self, x):
    x = F.relu(self.conv1(x))
    x = F.relu(self.conv2(x))
    x = self.pool1(x)
    x = F.relu(self.conv3(x))
    x = self.pool2(x)
    x = x.view(-1, 128*7*7)
    x = F.relu(self.fc1(x))
    x = F.relu(self.fc2(x))
    return x

Do I have to change any of this code to adapt it to RGB images? If so, what are these changes?

FYI:

  • The input images have the shape of 28x28
  • They are RGB (3 color channels)

Thank you so much.


Solution

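  • nn.Conv2d expects a batched input of shape (batch, channels, height, width), such as your 1x1x28x28. If the problem is otherwise identical (same classification task, same number of classes, same image size), the only thing that changes is the number of channels, from 1 to 3. So the input should have shape 1x3x28x28, and conv1 needs in_channels=3 instead of in_channels=1. The rest of the network can stay as it is, because conv1 still outputs 32 channels at 28x28.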

    Also, you can check out these two tutorials from PyTorch:

    1. https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html

    2. https://pytorch.org/tutorials/recipes/recipes/defining_a_neural_network.html
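
To make the change concrete, here is a minimal sketch of the network from the question adapted to 3-channel input. The class name RGBNet, the constructor parameters in_channels and num_classes, and the batch size of 4 are assumptions for illustration, not part of the original code. Everything else, including the final ReLU, is kept as in the question.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RGBNet(nn.Module):
    """Hypothetical RGB variant of the question's Net; only conv1's in_channels differs."""

    def __init__(self, in_channels=3, num_classes=27):
        super().__init__()
        # N x 3 x 28 x 28 -> N x 32 x 28 x 28 (the only layer that sees the RGB channels)
        self.conv1 = nn.Conv2d(in_channels, 32, kernel_size=3, stride=1, padding=1)
        # N x 32 x 28 x 28 -> N x 64 x 28 x 28
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        # N x 64 x 28 x 28 -> N x 64 x 14 x 14
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        # N x 64 x 14 x 14 -> N x 128 x 14 x 14
        self.conv3 = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        # N x 128 x 14 x 14 -> N x 128 x 7 x 7
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        # N x (128*7*7) -> N x 128
        self.fc1 = nn.Linear(128 * 7 * 7, 128)
        # N x 128 -> N x num_classes
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = self.pool1(x)
        x = F.relu(self.conv3(x))
        x = self.pool2(x)
        x = x.view(-1, 128 * 7 * 7)  # flatten each sample to a vector
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))  # kept as in the question
        return x


model = RGBNet()
batch = torch.randn(4, 3, 28, 28)  # 4 random RGB images, channels-first layout
out = model(batch)
print(out.shape)  # torch.Size([4, 27])
```

Note that the input must be channels-first (N x 3 x 28 x 28). If your images are loaded as 28x28x3 arrays, they would need to be rearranged before being passed to the model.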