I'm learning image classification with PyTorch. I found that the code for some papers applies 'CenterCrop' to both the training set and the test set, e.g. resizing to a larger size and then applying CenterCrop to obtain a smaller size. The smaller size is the standard input size in this research area.
In my experience, applying CenterCrop gives a significant improvement (e.g. 1% or 2%) on the test set compared with not applying CenterCrop to the test set.
Because it is used in top conference papers, this confuses me. So, does applying CenterCrop to the test set count as cheating? In addition, should I apply any data augmentation to the test set other than 'Resize' and 'Normalization'?
Thank you for your answer.
That is not cheating. You can apply any augmentation as long as the label is not used.
In image classification, people sometimes use a FiveCrop+Reflection technique: take five crops (Center, TopLeft, TopRight, BottomLeft, BottomRight) and their reflections as augmentations, predict class probabilities for each of the ten crops, and average the results. This typically gives some performance boost at roughly 10x the inference time.
In segmentation, people also use a similar test-time augmentation called "multi-scale testing", which resizes the input image to several scales before feeding it to the network. The predictions are then averaged.
If you do use this kind of augmentation, report it when you compare with other methods so that the comparison is fair.
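Multi-scale testing can be sketched in a similar way. The function name `predict_multi_scale`, the scale set (0.5, 1.0, 1.5), and the single-convolution "segmenter" are assumptions for illustration. Each prediction is resized back to the input resolution before averaging, since the outputs at different scales would otherwise not line up pixel by pixel.

```python
import torch
import torch.nn.functional as F
from torch import nn


def predict_multi_scale(model, image, scales=(0.5, 1.0, 1.5)):
    """Average per-pixel class probabilities over several input scales.

    image: (1, C, H, W) tensor. Returns a (1, num_classes, H, W) tensor.
    """
    _, _, height, width = image.shape
    model.eval()
    total = 0
    with torch.no_grad():
        for scale in scales:
            # Resize the input to the current scale.
            scaled = F.interpolate(image, scale_factor=scale,
                                   mode="bilinear", align_corners=False)
            logits = model(scaled)
            # Bring the prediction back to the original resolution.
            logits = F.interpolate(logits, size=(height, width),
                                   mode="bilinear", align_corners=False)
            total = total + torch.softmax(logits, dim=1)
    return total / len(scales)


# Hypothetical toy "segmenter": one conv layer mapping 3 channels to 4 classes.
torch.manual_seed(0)
toy_segmenter = nn.Conv2d(3, 4, kernel_size=3, padding=1)
image = torch.rand(1, 3, 32, 32)

probs = predict_multi_scale(toy_segmenter, image)
print(probs.shape)  # torch.Size([1, 4, 32, 32])
```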