python deep-learning pytorch normalize torchvision

Torchvision 0.2.1 transforms.Normalize does not work as expected


I am trying out some new code using PyTorch. To load the dataset (CIFAR10), I am using torchvision's datasets. I define two transforms, ToTensor() and Normalize(). After Normalize, I expect the data in the dataset to be between 0 and 1, but the max value is still 255. I also inserted a print statement inside the '__call__' function of the Normalize class in transforms.py (Lib\site-packages\torchvision\transforms\transforms.py). That print never appears when I run the code either, and I am not sure what is happening. Every page I have found on the internet uses these transforms almost the same way I do. For example, some of the sites I visited:
https://github.com/adventuresinML/adventures-in-ml-code/blob/master/pytorch_nn.py
https://github.com/pytorch/tutorials/blob/master/beginner_source/blitz/cifar10_tutorial.py

My code is given below. It reads the dataset with and without Normalize, then prints some stats. The min and max printed indicate whether the data is normalized or not.

import torchvision as tv
import numpy as np

dataDir = 'D:\\general\\ML_DL\\datasets\\CIFAR'

trainTransform  = tv.transforms.Compose([tv.transforms.ToTensor()])
trainSet        = tv.datasets.CIFAR10(dataDir, train=True, download=False, transform=trainTransform)
print (trainSet.train_data.mean(axis=(0,1,2))/255)
print (trainSet.train_data.min())
print (trainSet.train_data.max())
print (trainSet.train_data.shape)

trainTransform  = tv.transforms.Compose([tv.transforms.ToTensor(), tv.transforms.Normalize((0.4914, 0.4822, 0.4466), (0.247, 0.243, 0.261))])
trainSet        = tv.datasets.CIFAR10(dataDir, train=True, download=False, transform=trainTransform)
print (trainSet.train_data.mean(axis=(0,1,2))/255)
print (trainSet.train_data.min())
print (trainSet.train_data.max())
print (trainSet.train_data.shape)

The output looks like,

[ 0.49139968  0.48215841  0.44653091]
0
255
(50000, 32, 32, 3)
[ 0.49139968  0.48215841  0.44653091]
0
255
(50000, 32, 32, 3)

Please help me understand this better. Most of the other transforms I tried, for example Grayscale and CenterCrop, end up with similar results.


Solution

  • So, in your code you have laid out a plan for how you want to process your data. You have created a data pipeline through which your data will flow and in which multiple transforms will be applied.

    However, you forgot to call torch.utils.data.DataLoader. The transforms are applied lazily, each time a sample is fetched from the dataset (which is what the DataLoader does), so the raw array in trainSet.train_data is never modified. You can read more about it here.

    Now, when we add the DataLoader to your code as follows:

    import torch

    trainTransform  = tv.transforms.Compose([tv.transforms.ToTensor(), 
                      tv.transforms.Normalize((0.4914, 0.4822, 0.4466), (0.247, 0.243, 0.261))])
    
    trainSet = tv.datasets.CIFAR10(root=dataDir, train=True,
                                        download=False, transform=trainTransform)
    
    # On Windows, run this under `if __name__ == '__main__':` or set num_workers=0
    dataloader = torch.utils.data.DataLoader(trainSet, batch_size=32, shuffle=False, num_workers=4)
    

    and print a batch of images like this:

    images, labels = next(iter(dataloader))
    print(images)
    print(images.max())
    print(images.min())
    

    we get tensors with the transforms applied.

    A small snippet of the output:

    [[ 1.8649,  1.8198,  1.8348,  ...,  0.3924,  0.3774,  0.2572],
          [ 1.9701,  1.9550,  1.9851,  ...,  0.7230,  0.6929,  0.6629],
          [ 2.0001,  1.9550,  2.0001,  ...,  0.7831,  0.7530,  0.7079],
          ...,
          [-0.8096, -1.0049, -1.0350,  ..., -1.3355, -1.3655, -1.4256],
          [-0.7796, -0.8697, -0.9749,  ..., -1.2754, -1.4557, -1.5609],
          [-0.7645, -0.7946, -0.9298,  ..., -1.4106, -1.5308, -1.5909]]]])
    tensor(2.1309)
    tensor(-1.9895) 
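
To see why the raw array in the question never changes, here is a minimal pure-Python sketch of the pattern torchvision datasets follow: the transform runs inside `__getitem__`, one sample at a time, and the stored data is left alone. `ToyDataset` and `to_unit_range` are hypothetical names made up for this illustration, not torchvision code, and the scaling function only imitates the divide-by-255 step of ToTensor.

```python
# Toy stand-in for a torchvision-style dataset (hypothetical, not library code).
class ToyDataset:
    def __init__(self, raw, transform=None):
        self.data = raw              # raw values, stored untouched
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        sample = self.data[index]
        if self.transform is not None:
            sample = self.transform(sample)   # applied lazily, per item
        return sample


def to_unit_range(pixel):
    """Imitates the scaling step of ToTensor for a single uint8 value."""
    return pixel / 255.0


raw_pixels = [0, 128, 255]
dataset = ToyDataset(raw_pixels, transform=to_unit_range)

print(max(dataset.data))                              # raw storage: 255
print(max(dataset[i] for i in range(len(dataset))))   # via __getitem__: 1.0
```

Reading `dataset.data` here plays the same role as reading `trainSet.train_data` in the question: it bypasses `__getitem__`, so no transform is ever called.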
    

    Secondly, transforms.Normalize(mean, std) applies input[channel] = (input[channel] - mean[channel]) / std[channel]. ToTensor has already scaled the pixels to [0, 1]; Normalize then shifts and rescales them, so with the mean and standard deviation you are providing the transformed values cannot stay in the range (0, 1). If you want values between -1 and 1, you can use the following:

    trainTransform  = tv.transforms.Compose([tv.transforms.ToTensor(), 
                      tv.transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
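
As a quick check of that formula, the sketch below works out the smallest and largest value Normalize can produce for inputs in [0, 1], for both parameter sets used in this answer. It is plain arithmetic with no torch dependency, and `normalized_range` is a helper name invented for this example.

```python
def normalized_range(means, stds):
    """Range of (x - mean) / std over all channels for x in [0, 1]."""
    lows = [(0.0 - m) / s for m, s in zip(means, stds)]
    highs = [(1.0 - m) / s for m, s in zip(means, stds)]
    return min(lows), max(highs)


# CIFAR10 statistics from the question
cifar_low, cifar_high = normalized_range((0.4914, 0.4822, 0.4466),
                                         (0.247, 0.243, 0.261))
print(round(cifar_low, 4), round(cifar_high, 4))   # -1.9895 2.1309

# mean 0.5, std 0.5 on every channel
half_low, half_high = normalized_range((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
print(half_low, half_high)                         # -1.0 1.0
```

The CIFAR10 bounds agree with the min and max printed for the batch above, and the (0.5, 0.5) setting maps [0, 1] exactly onto [-1, 1].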
    

    I hope it helps! :)