Tags: pytorch, torch, tensor, torchvision

Combination of torch.cat and torchvision.transforms leads to an all-zero tensor


I want to store some additional information in a fourth layer of a tensor whose first three layers come from an image. Afterwards I want to cut a piece out of the image (data augmentation) and resize it to a given size.

For this I created a tensor from a picture and joined it with a one-layer tensor of additional information using torch.cat. (Almost all, but not all, entries of the second tensor are zero.)

I passed the result through a transforms.Compose pipeline (to crop and resize the tensor), but afterwards the tensor consisted entirely of zeros.

Here I have built a reproducible example.

import torch
from torchvision import transforms

height = 2
width = 4
resize = 2
tensor3 = torch.rand(3,height,width)
tensor1 = torch.zeros(1,height,width)
#tensor1 = torch.rand(1,height,width)

imageToTensor = transforms.ToTensor()
tensorToImage = transforms.ToPILImage()

train_transform = transforms.Compose([
    transforms.RandomResizedCrop(resize, scale=(0.9, 1.0)),
    transforms.ToTensor(),
])

tensor4 = torch.cat((tensor3,tensor1),0)

image4 = tensorToImage(tensor4)
transformed_image4 = train_transform(image4)

print(tensor4)
print(transformed_image4)

Output:

tensor([[[0.6774, 0.5293, 0.4420, 0.2463],
         [0.1391, 0.7481, 0.3436, 0.9391]],

        [[0.0652, 0.2061, 0.2931, 0.6126],
         [0.2618, 0.3506, 0.5095, 0.7351]],

        [[0.8555, 0.6320, 0.9461, 0.0928],
         [0.2094, 0.3944, 0.0528, 0.7900]],

        [[0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000]]])

tensor([[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]])

If I use tensor1 = torch.rand(1, height, width) instead, the problem does not occur; it only appears when most entries are zero. With scale=(0.5, 1.0) I don't have the problem either.

Now some questions:

  1. How can I get the first three layers resized while keeping their non-zero entries?

  2. Did I misunderstand something, or is this behavior really unexpected?


Solution

  • I opened an issue:

    https://github.com/pytorch/pytorch/issues/22611

    The answer was that torchvision transforms only support PIL images.

    An alternative is the albumentations library for transformations.
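
    If adding another library is not an option, one way to avoid the problem is to skip the PIL round trip and crop and resize the tensor directly. The following is only a minimal sketch under some assumptions: the helper name random_resized_crop_tensor is made up for this example, it keeps the source aspect ratio instead of sampling one as RandomResizedCrop does, and it uses torch.nn.functional.interpolate for the resize. A likely cause of the zeros (my assumption, not something stated in the issue) is that ToPILImage turns a 4-channel tensor into an RGBA image, so the fourth layer is treated as transparency during resizing. On a plain tensor every channel is just data.

```python
import math
import random

import torch
import torch.nn.functional as F


def random_resized_crop_tensor(x, size, scale=(0.9, 1.0)):
    """Hypothetical helper: random crop, then resize to (size, size).

    x is a float tensor of shape (C, H, W). Every channel is handled as
    plain data, so no channel is interpreted as an alpha channel.
    """
    _, height, width = x.shape

    # Sample the fraction of the area to keep. Simplification: the crop
    # keeps the aspect ratio of the source.
    side = math.sqrt(random.uniform(*scale))
    crop_h = max(1, min(height, round(height * side)))
    crop_w = max(1, min(width, round(width * side)))

    # Pick a random position for the crop window.
    top = random.randint(0, height - crop_h)
    left = random.randint(0, width - crop_w)
    crop = x[:, top:top + crop_h, left:left + crop_w]

    # interpolate expects a batch dimension: (N, C, H, W).
    resized = F.interpolate(
        crop.unsqueeze(0),
        size=(size, size),
        mode="bilinear",
        align_corners=False,
    )
    return resized.squeeze(0)


torch.manual_seed(0)
tensor3 = torch.rand(3, 2, 4)
tensor1 = torch.zeros(1, 2, 4)
tensor4 = torch.cat((tensor3, tensor1), 0)

out = random_resized_crop_tensor(tensor4, size=2)
print(out.shape)
print(out)
```

    With this approach the first three layers keep their non-zero values after resizing, and the fourth layer stays whatever it was (here all zeros). The interpolation will not match PIL's output exactly, so treat it as a starting point rather than a drop-in replacement.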