In PyTorch, I know that certain image processing transformations can be composed like this:
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])
In my case, each image has a corresponding annotation of bounding box coordinates in YOLO format. Does PyTorch allow these transformations to be applied to the bounding box coordinates of the image as well, so that they can later be saved as new annotations? Thanks.
The transformations that you used as examples do not change the bounding box coordinates. ToTensor() converts a PIL image to a torch tensor and Normalize() is used to normalize the channels of the image.
Transformations such as RandomCrop() and RandomRotation() change the geometry of the image, so they will cause a mismatch between the location of the bounding box and the (modified) image.
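To make the mismatch concrete, here is a minimal sketch of how a box would have to be remapped after a crop. The helper name `crop_yolo_box` and its argument layout are my own assumptions, not a torchvision API. The crop window uses the (top, left, height, width) order that torchvision.transforms.functional.crop takes, and the box is assumed to be a normalized YOLO tuple (x_center, y_center, width, height).

```python
def crop_yolo_box(box, image_size, crop):
    """Remap one normalized YOLO box into a crop window (hypothetical helper).

    box        -- (x_center, y_center, width, height), all in [0, 1]
    image_size -- (image_width, image_height) in pixels
    crop       -- (top, left, crop_height, crop_width) in pixels
    Returns the box normalized to the crop, or None if it falls outside.
    """
    xc, yc, w, h = box
    img_w, img_h = image_size
    top, left, crop_h, crop_w = crop

    # Box corners in pixels, shifted so the crop's top-left corner is the origin.
    x1 = (xc - w / 2) * img_w - left
    y1 = (yc - h / 2) * img_h - top
    x2 = (xc + w / 2) * img_w - left
    y2 = (yc + h / 2) * img_h - top

    # Clip the box to the crop window.
    x1, x2 = max(0.0, x1), min(float(crop_w), x2)
    y1, y2 = max(0.0, y1), min(float(crop_h), y2)
    if x2 <= x1 or y2 <= y1:
        return None  # the object was cropped away entirely

    # Back to normalized YOLO format, now relative to the crop size.
    return ((x1 + x2) / 2 / crop_w, (y1 + y2) / 2 / crop_h,
            (x2 - x1) / crop_w, (y2 - y1) / crop_h)


# The stale annotation and the remapped one differ after a crop.
stale_box = (0.5, 0.5, 0.2, 0.4)
print(stale_box, crop_yolo_box(stale_box, (200, 100), (20, 50, 60, 100)))
```

Returning None for boxes that end up outside the crop is a design choice of this sketch; a real pipeline might also drop boxes that keep only a small fraction of their original area.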
However, PyTorch makes it easy to write your own transformations, which gives you control over what happens to the bounding box coordinates.
Docs for more details: https://pytorch.org/docs/stable/torchvision/transforms.html#functional-transforms
As an example (modified from the documentation):
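import math
import random
import torchvision.transforms.functional as TF

def my_rotation(image, bounding_box_coordinate):
    # Assumes a PIL image and one YOLO box (x_center, y_center, width, height) in [0, 1].
    if random.random() > 0.5:
        angle = random.randint(-30, 30)
        image = TF.rotate(image, angle)  # counter-clockwise about the image center
        # TF.rotate only accepts images, so the box is rotated by hand, in pixels.
        (W, H), (xc, yc, w, h) = image.size, bounding_box_coordinate
        c, s = math.cos(math.radians(angle)), math.sin(math.radians(angle))
        x, y = (xc - 0.5) * W, (yc - 0.5) * H  # box center relative to the image center
        xc, yc = (c * x + s * y) / W + 0.5, (c * y - s * x) / H + 0.5  # y axis points down
        w, h = abs(c) * w + abs(s) * h * H / W, abs(s) * w * W / H + abs(c) * h  # enclosing box
        bounding_box_coordinate = (xc, yc, w, h)
    # more transforms ...
    return image, bounding_box_coordinate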
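The question also asks about saving the transformed boxes as new annotations. Below is a minimal sketch for the simplest geometric case, a horizontal flip, followed by writing the result back in YOLO text format (one line per object: class id, x_center, y_center, width, height, all normalized). The helper names `hflip_yolo_box` and `save_yolo_annotations` are hypothetical, not part of torchvision; the box helper is meant to be paired with TF.hflip(image) on the image side.

```python
from pathlib import Path


def hflip_yolo_box(box):
    """Mirror one YOLO box (class_id, x_center, y_center, width, height) left to right.

    Only the x center changes; width and height are unaffected by a flip.
    """
    class_id, xc, yc, w, h = box
    return (class_id, 1.0 - xc, yc, w, h)


def save_yolo_annotations(boxes, path):
    """Write boxes to a YOLO-style text file, one object per line."""
    lines = [
        f"{int(class_id)} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"
        for class_id, xc, yc, w, h in boxes
    ]
    Path(path).write_text("\n".join(lines) + "\n")
```

A typical use would be to flip the image with TF.hflip, map every box of that image through hflip_yolo_box, and write the result next to the new image, for example save_yolo_annotations(flipped_boxes, "image_0001_flipped.txt") (the file name here is only an illustration).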
Hope that helps =)