Optimize the cropping function

I'm using the following code to crop an image and retrieve a non-rectangular patch.

def crop_image(img,roi):
    height = img.shape[0]
    width = img.shape[1]

    mask = np.zeros((height, width), dtype=np.uint8)
    points = np.array([roi])
    cv2.fillPoly(mask, points, (255))

    res = cv2.bitwise_and(img, img, mask=mask)

    rect = cv2.boundingRect(points)  # returns (x,y,w,h) of the rect
    cropped = res[rect[1]: rect[1] + rect[3], rect[0]: rect[0] + rect[2]]

    return cropped, res

The roi is [(1053, 969), (1149, 1071), (883, 1075), (813, 983)]. The above code works, but it is too slow. How can I speed it up? Is there a better way to crop non-rectangular patches?

Solution

I see two parts that could be optimized.

First, crop the image to the bounding rectangle before doing anything else. This dramatically reduces the size of the images you are working with. You only have to translate the points of the roi by the (x, y) of the rect and you are good to go.
Second, in the bitwise_and operation you are "anding" the image with itself, and the mask is checked at each pixel to decide whether to output it. I guess this is where most of the time is spent. Instead, you can "and" the image directly with the mask and skip the extra mask-checking step. This needs one minor tweak: the mask must have exactly the same shape as the image it is combined with (including channels).
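
As a small illustration of the first point, the translation of the roi into the bounding rectangle's coordinate frame can be sketched with NumPy alone. Here the top-left corner is taken as the per-axis minimum of the points, which is a stand-in for the x and y that cv2.boundingRect would return, and the name local_points is only used for this example.

```python
import numpy as np

roi = [(1053, 969), (1149, 1071), (883, 1075), (813, 983)]
points = np.array(roi, dtype=np.int32)

# Top-left corner of the bounding rectangle
# (stand-in for the x, y returned by cv2.boundingRect).
x, y = points.min(axis=0)

# Shift every vertex so that the rectangle's top-left corner becomes (0, 0).
local_points = points - (x, y)
print(local_points.tolist())
```

After the shift, the leftmost vertex has x = 0 and the topmost vertex has y = 0, so the polygon can be drawn into a mask that is only as large as the bounding rectangle.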

Edit: Modified the code to support any number of channels in the input image.

The code below does these two things:

import cv2
import numpy as np

def crop_image(img, roi):
    channels = img.shape[2] if img.ndim > 2 else 1

    # OpenCV expects polygon points as 32-bit integers
    points = np.array([roi], dtype=np.int32)

    rect = cv2.boundingRect(points)  # returns (x, y, w, h) of the rect

    mask_shape = (rect[3], rect[2]) if channels == 1 else (rect[3], rect[2], channels)

    # Notice how the mask image size is now the size of the bounding rect
    mask = np.zeros(mask_shape, dtype=np.uint8)

    # Translate the points so that their origin is the bounding rect's top-left point
    points -= np.array(rect[:2], dtype=np.int32)

    mask_filling = (255,) * channels
    cv2.fillPoly(mask, points, mask_filling)

    cropped = img[rect[1]: rect[1] + rect[3], rect[0]: rect[0] + rect[2]]

    # "and" directly with the same-shaped mask, no mask= argument needed
    res = cv2.bitwise_and(cropped, mask)

    # Note: cropped is the unmasked rectangular patch, res is the masked patch
    return cropped, res