Tags: numpy, pytorch, conv-neural-network

How to pass numpy image through PyTorch Conv2d


My question is essentially its title: how do I pass a numpy image through PyTorch's Conv2d? My numpy array of images has shape (number of images x height x width x channels), while nn.Conv2d() expects input of shape (batch size x channels x height x width). So what should I do? Reshape the numpy array and create a tensor from it?

I tried to reshape the array of images with np.reshape, but it didn't work out.


Solution

  • Reshaping the numpy array is not a good way to get your data into the desired format, because np.reshape reorders the underlying values rather than moving the axes. It is better to convert the array to a tensor first and rearrange it with the transformation functions PyTorch provides, as the short demonstration below shows.
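
    To see why, here is a tiny self-contained demonstration (numpy only, using a made-up 2x2 RGB image): reshape merely reinterprets the flat buffer, while transpose (permute in PyTorch) actually moves the axes.

    import numpy as np
    
    # a tiny 2x2 RGB image in channels-last layout: shape (2, 2, 3)
    img = np.arange(12).reshape(2, 2, 3)
    
    # reshape only reinterprets the flat buffer, so pixel values
    # end up scattered across the wrong channels
    wrong = img.reshape(3, 2, 2)
    
    # transpose actually moves the axes, keeping each pixel's
    # channel values together
    right = img.transpose(2, 0, 1)
    
    print(np.array_equal(wrong, right))  # False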

    To pass your numpy array of images into nn.Conv2d: as you said, what you have is (number of images x height x width x channels) as your numpy image shape. Here is a code sample:

    # suppose the numpy array is kept in a variable called numpy_data;
    # index it to get the first image as a sample
    numpy_img = numpy_data[0]
    

    Now numpy_img has shape (height, width, channel_size), since we indexed the first image in the batch. Next, convert the numpy array into a PyTorch tensor, so that you can adjust it into the desired shape with PyTorch's transformation functions:

    import torch
    
    # convert the numpy array to a torch.Tensor
    tensor_image = torch.from_numpy(numpy_img)
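
    One practical detail worth flagging (an assumption about your data, since image arrays are commonly uint8): torch.from_numpy preserves the numpy dtype, and nn.Conv2d expects floating-point input, so a cast may be needed:

    # Conv2d works on float tensors; cast if the source array was uint8
    tensor_image = tensor_image.float()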
    

    Next, move the channel dimension in front of height and width -> (channel_size, height, width):

    # rearrange the axes from (H, W, C) to (C, H, W)
    tensor_image = tensor_image.permute(2, 0, 1)
    

    Finally, to obtain (batch size x channel_size x height x width), the remaining step is to add a batch dimension in front, which can be done with torch.unsqueeze():

    # add a batch dimension of size 1 in front: (C, H, W) -> (1, C, H, W)
    tensor_image = tensor_image.unsqueeze(0)
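
    An equivalent alternative, if you prefer indexing syntax, is the common PyTorch shorthand of indexing with None:

    # same effect as unsqueeze(0): inserts a new leading dimension of size 1
    tensor_image = tensor_image[None]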
    

    As a result, your tensor_image ends up with shape (1 x channel_size x height x width), which is the format that can be passed through nn.Conv2d.
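
    Putting it all together, here is a minimal end-to-end sketch. The shapes and layer parameters are made up for illustration (8 RGB images of size 32x32, a hypothetical Conv2d with 16 output channels); note that a whole batch can be permuted at once, so the unsqueeze step is only needed when starting from a single image:

    import numpy as np
    import torch
    import torch.nn as nn
    
    # hypothetical batch: 8 RGB images, 32x32, channels-last (N, H, W, C)
    numpy_data = np.random.randint(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
    
    # convert and rearrange the whole batch: (N, H, W, C) -> (N, C, H, W)
    batch = torch.from_numpy(numpy_data).permute(0, 3, 1, 2).float()
    
    conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
    output = conv(batch)
    print(output.shape)  # torch.Size([8, 16, 32, 32])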

    Hope this helps!