I have the following relative coordinates:
[[0.6625, 0.6035714285714285], [0.7224999999999999, 0.6035714285714285], [0.7224999999999999, 0.6571428571428571], [0.6625, 0.6571428571428571], [0.6625, 0.6035714285714285]]
(However, I don't understand why there are 5 values here instead of the usual 4, or what they mean.)
My attempt with scikit-image,
which shows the whole picture instead of cropping:
import numpy as np
from skimage import io, draw
img = io.imread(pic)
vals = [[0.6625, 0.6035714285714285], [0.7224999999999999, 0.6035714285714285], [0.7224999999999999, 0.6571428571428571], [0.6625, 0.6571428571428571], [0.6625, 0.6035714285714285]]
vertices = np.asarray(vals)
rows, cols = draw.polygon(vertices[:, 0], vertices[:, 1])
crop = img.copy()
crop[:, :, -1] = 0
crop[rows, cols, -1] = 255
io.imshow(crop)
io.show()
# shows whole pic instead of cropping
My attempt with OpenCV,
which raises an error because the coordinates are floats:
import cv2 as cv
vals = [[0.6625, 0.6035714285714285], [0.7224999999999999, 0.6035714285714285], [0.7224999999999999, 0.6571428571428571], [0.6625, 0.6571428571428571], [0.6625, 0.6035714285714285]]
x = vals[0][0]
y = vals[0][1]
width = vals[1][0] - x
height = vals[2][1] - y
img = cv.imread(pic)
crop_img = img[y:y+height, x:x+width]
cv.imshow("cropped", crop_img)
cv.waitKey(0)
# TypeError: slice indices must be integers or None or have an __index__ method
How can I crop the car's number plate from this picture given its relative bbox coordinates?
I am not limited to any framework, so if you think TF or anything else might help, please suggest it.
Inspection of
vals = [[0.6625, 0.6035714285714285], [0.7224999999999999, 0.6035714285714285], [0.7224999999999999, 0.6571428571428571], [0.6625, 0.6571428571428571], [0.6625, 0.6035714285714285]]
shows that the first and last entries in the list are identical, so the list describes a closed polygon with four distinct corners. In image processing, position (0, 0) is the top-left corner. Assuming each pair is (row, column), the values suggest the following order:
[top_left, bottom_left, bottom_right, top_right, top_left]
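That reading can be checked with a short sketch. It assumes each pair is (row, column) and that the box is axis-aligned, so the smallest and largest values give the two opposite corners; the names `points`, `corners` and `is_closed` are only illustrative.

```python
import numpy as np

vals = [[0.6625, 0.6035714285714285],
        [0.7224999999999999, 0.6035714285714285],
        [0.7224999999999999, 0.6571428571428571],
        [0.6625, 0.6571428571428571],
        [0.6625, 0.6035714285714285]]

points = np.asarray(vals)

# The ring is closed: the fifth point repeats the first one.
is_closed = bool(np.array_equal(points[0], points[-1]))

# Dropping the repeated point leaves the four distinct corners.
corners = points[:-1]

# Assumption: axis-aligned box, so the per-column minimum and maximum
# are the top-left and bottom-right corners.
top_left = corners.min(axis=0)
bottom_right = corners.max(axis=0)

print(is_closed, top_left, bottom_right)
```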
All numbers lie between 0 and 1, which suggests that these are relative coordinates. To rescale them to image dimensions, multiply the first value of each pair by the image height and the second by the image width:
# dummy image sizes (for a loaded image: image_height, image_width = img.shape[:2]):
image_height = 480
image_width = 640
# rescale to image dimensions and convert to int to allow slicing:
bbox_coordinates = [[int(a[0] * image_height), int(a[1] * image_width)] for a in vals]
Now you can use array slicing on the image to crop:
top_left = bbox_coordinates[0]
bottom_right = bbox_coordinates[2]
bbox = img[top_left[0]:bottom_right[0], top_left[1]:bottom_right[1]]
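Put together, the steps above might look like the sketch below. It is not a definitive implementation: `crop_relative_bbox` is a hypothetical helper name, the pairs are assumed to be (row, column), and a blank NumPy array stands in for the real picture.

```python
import numpy as np

def crop_relative_bbox(img, ring):
    """Crop img to the box spanned by a ring of relative (row, col) points."""
    height, width = img.shape[:2]
    points = np.asarray(ring, dtype=float)
    # Scale the first value of each pair by the height, the second by the width.
    scaled = points * [height, width]
    # Truncate to int so the values can be used as slice indices.
    top, left = scaled.min(axis=0).astype(int)
    bottom, right = scaled.max(axis=0).astype(int)
    return img[top:bottom, left:right]

vals = [[0.6625, 0.6035714285714285],
        [0.7224999999999999, 0.6035714285714285],
        [0.7224999999999999, 0.6571428571428571],
        [0.6625, 0.6571428571428571],
        [0.6625, 0.6035714285714285]]

# Stand-in for a real image: 480 rows, 640 columns, 3 channels.
img = np.zeros((480, 640, 3), dtype=np.uint8)
crop = crop_relative_bbox(img, vals)
print(crop.shape)
```

Since both `cv.imread` and `io.imread` return NumPy arrays, the same slicing should work on an image loaded with either. If the pairs turn out to be (x, y) rather than (row, column), swapping the two columns first (for example with `points[:, ::-1]`) would be needed.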