python neural-network pytorch conv-neural-network

How to make batch with pictures of different sizes for model in PyTorch?


I want to use global average pooling in my PyTorch model without resizing, cropping, or padding the images. I can train the model with a single image per iteration (no batching), but that is too slow, and I don't know how to feed several images of different sizes to the model as one input. Example model code:

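import torch.nn as nn


class GAPModel(nn.Module):
  def __init__(self):
    super().__init__()

    # Convolutional feature extractor: 3 input channels -> 16 feature maps.
    self.conv = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3),
        nn.ReLU(inplace=True),
    )

    # Head applied to the 16 pooled channel features.
    self.linear = nn.Sequential(
        nn.Linear(in_features=16, out_features=1),
        nn.ReLU(),
    )

  def forward(self, image):
    # Global average pooling: average over height and width (dims 2 and 3),
    # turning (N, 16, H, W) into (N, 16) whatever H and W are.
    return self.linear(self.conv(image).mean([2, 3]))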

Solution

  • One idea is to build each batch only from images of the same size. Things to watch out for:

    1. shuffling: group the indexes/image IDs by size, then shuffle within each group.
    2. last batch: do something similar to drop_last if necessary (see the torch DataLoader documentation)
    3. there may be more work to do ...
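
The grouping idea above could be sketched roughly as follows. This is only a minimal sketch, not a tested recipe: `SameSizeBatchSampler`, its `seed` argument, and the `sizes` list are hypothetical names, and it assumes the (height, width) of every image is known up front, before any image is loaded.

```python
import random
from collections import defaultdict


class SameSizeBatchSampler:
    """Yield lists of dataset indices whose images share one (height, width).

    Hypothetical helper, not a PyTorch class. `sizes[i]` is assumed to be
    the (height, width) of dataset item i.
    """

    def __init__(self, sizes, batch_size, drop_last=False, seed=None):
        self.batch_size = batch_size
        self.drop_last = drop_last
        self.rng = random.Random(seed)
        self.groups = defaultdict(list)
        for index, size in enumerate(sizes):
            self.groups[tuple(size)].append(index)

    def __iter__(self):
        batches = []
        for indices in self.groups.values():
            indices = list(indices)
            self.rng.shuffle(indices)  # shuffle inside each size group
            for start in range(0, len(indices), self.batch_size):
                batch = indices[start:start + self.batch_size]
                if self.drop_last and len(batch) < self.batch_size:
                    continue  # skip the short leftover batch of this group
                batches.append(batch)
        self.rng.shuffle(batches)  # mix the size groups over the epoch
        return iter(batches)

    def __len__(self):
        total = 0
        for indices in self.groups.values():
            full, rest = divmod(len(indices), self.batch_size)
            total += full + (1 if rest and not self.drop_last else 0)
        return total


# Toy example: five 32x32 images and three 48x64 images.
sizes = [(32, 32)] * 5 + [(48, 64)] * 3
sampler = SameSizeBatchSampler(sizes, batch_size=2, seed=0)
for batch in sampler:
    # every index in a batch points at an image of the same size
    assert len({sizes[i] for i in batch}) == 1

# With PyTorch installed, the sampler can be handed to a DataLoader:
#   loader = DataLoader(dataset, batch_sampler=sampler)
```

Because every batch then holds tensors of one shape, the default collate function should be able to stack them, and the model above can consume the batch unchanged. Note that `drop_last` in this sketch drops the short leftover batch of every size group, not just the final batch of the epoch, so with many distinct sizes it may discard a noticeable share of the data.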