I am a little confused about data augmentation. If I apply data augmentation to the training dataset, should the validation dataset go through the same operations? For example:
from torchvision import transforms

data_transforms = {
    # Training: random crop and flip, so the model sees varied views
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    # Validation: fixed resize and center crop, no randomness
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}
Why are the 'Resize' and 'CenterCrop' operations applied to the 'val' dataset?
Since validation data is used to measure how good a trained model is, it should not change across different trained models. That is, we should use a fixed measure to evaluate things. This is why the validation transforms contain no randomness, unlike the training augmentation. 'Resize' and 'CenterCrop' are still needed, but only as deterministic preprocessing: they bring every validation image to the same 224x224 input size that the training crops produce.
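To make the difference concrete, here is a minimal sketch in plain Python (no torchvision needed). The helpers `center_crop` and `random_crop` are hypothetical, simplified stand-ins for the cropping step only, and the 8x8 nested list is a toy "image", not real data. The idea is just to show that the validation-style crop returns the same result on every call while the training-style crop does not.

```python
import random


def center_crop(image, size):
    """Deterministic crop: always returns the same central size x size window."""
    height, width = len(image), len(image[0])
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def random_crop(image, size, rng):
    """Random crop: the window position is drawn anew on every call."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - size)
    left = rng.randint(0, width - size)
    return [row[left:left + size] for row in image[top:top + size]]


# A toy 8x8 "image" whose pixel values encode their position.
image = [[row * 8 + col for col in range(8)] for row in range(8)]

rng = random.Random(0)

# Validation-style preprocessing: identical output on every call.
val_views = [center_crop(image, 4) for _ in range(20)]
val_is_deterministic = all(view == val_views[0] for view in val_views)

# Training-style augmentation: the output varies from call to call.
train_views = [random_crop(image, 4, rng) for _ in range(20)]
distinct_train_views = {tuple(map(tuple, view)) for view in train_views}

print("validation views identical:", val_is_deterministic)
print("distinct training views:", len(distinct_train_views))
```

If the validation crop were random too, two evaluations of the same model could give different scores, so a difference between two models could come from the crops rather than from the models.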
SIDE NOTE:
Unlike test data, which is held out for the final evaluation, validation data is used to tune the hyperparameters.