
Data augmentation in validation


I am a little confused about data augmentation. If I apply data augmentation to the training dataset, should the validation dataset go through the same operations? For example:

from torchvision import transforms

data_transforms = {
    # Preprocessing applied to training images
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    # Preprocessing applied to validation images
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

Why do we apply the Resize and CenterCrop operations to the 'val' dataset?


Solution

  • Validation data is used to measure how good a trained model is, so its preprocessing must stay the same across different trained models; we need a fixed yardstick to compare them. This is why the validation transforms contain no randomness, unlike the training augmentation. Resize(256) followed by CenterCrop(224) is deterministic: every validation image is always turned into the same 224x224 input, which is also the size the training crops produce.

    SIDE NOTE:

    Unlike test data, validation data is used to tune the hyperparameters.
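
To make the "no randomness" point concrete, here is a minimal pure-Python sketch. It does not use torchvision: `center_crop` and `random_crop` are hypothetical stand-ins written for illustration (they are not the library's implementations), and the 6x6 nested list is a toy "image". The idea is only to show that a center crop returns the same view on every call, while a random crop may return a different view each time.

```python
import random


def center_crop(image, size):
    """Deterministic crop: always takes the central size x size window."""
    height, width = len(image), len(image[0])
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def random_crop(image, size, rng):
    """Random crop: the window position is drawn from rng on every call."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - size)
    left = rng.randint(0, width - size)
    return [row[left:left + size] for row in image[top:top + size]]


# Toy 6x6 "image" whose pixel values encode their position (assumption for illustration).
image = [[r * 6 + c for c in range(6)] for r in range(6)]

rng = random.Random(0)
val_views = [center_crop(image, 4) for _ in range(5)]
train_views = [random_crop(image, 4, rng) for _ in range(5)]

# Validation-style preprocessing yields one fixed view of the image.
val_is_fixed = all(view == val_views[0] for view in val_views)

# Training-style preprocessing may yield several different views.
distinct_train_views = len({tuple(map(tuple, view)) for view in train_views})

print(val_is_fixed)  # True: the center crop never changes
```

Because the validation view never changes, any difference in validation score between two models comes from the models themselves and not from the preprocessing.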