machine-learning deep-learning pytorch data-augmentation image-augmentation

Will using image augmentation techniques in PyTorch also increase the dataset size on the local machine?


I was training a custom model in PyTorch on a very imbalanced dataset: there are 10 classes, some with only 800 images and others with 4000. I found that image augmentation could help with this and reduce overfitting, but I got confused while implementing it. I used the code below to randomly alter the images:

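from torchvision import transforms

# Randomly rotate (within +/-30 degrees), crop and resize to 140x140, and flip horizontally
loader_transform = transforms.Compose([
    transforms.RandomRotation(30),
    transforms.RandomResizedCrop(140),
    transforms.RandomHorizontalFlip(),
])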

But during training, the dataset still shows the original number of images. Where did the newly created augmented images go? And if I want to save them on my local machine and make all classes the same size, what can be done?


Solution

  • It looks like you are using online augmentation. If you would like to use offline augmentation, add a pre-processing step that saves the augmented images to disk, and then use those images in the training step.

    Please make sure you understand the difference between online and offline augmentation:

    Offline or pre-processing Augmentation

    Augmentation is applied as a pre-processing step to increase the size of the dataset. This is usually done to expand a small training set. When applying it to larger datasets, disk space has to be taken into account.

    Online or real-time Augmentation

    Augmentation is applied on the fly, with random transforms run each time an image is loaded. Since the augmented images do not need to be saved to disk, this method is usually applied to large datasets. The number of images in the dataset stays the same, but at each epoch the model sees a differently augmented version of each image.