Using python mss to draw bounding box on top of screen record


I have code that records the screen, and for each frame I have a set of bounding boxes that I want to display on that frame. I could do this with matplotlib or something similar, but mss captures at around 30 fps, so I need to be able to draw the bounding boxes quickly.

I noticed this example in the docs, but when I tried running it I couldn't get it to display anything. I'm also not sure it would work for my case.

import cv2
import time
import numpy as np
from mss import mss

with mss() as sct:
        # Part of the screen to capture
        monitor = {"top": 79, "left": 265, "width": 905, "height": 586}

        while "Screen capturing":
            last_time = time.time()

            # Get raw pixels from the screen, save it to a Numpy array
            screen = np.array(sct.grab(monitor))

            # print("fps: {}".format(1 / (time.time() - last_time)))

            print('loop took {} seconds'.format(time.time()-last_time))
            last_time = time.time()
            screen = cv2.cvtColor(screen, cv2.COLOR_BGR2RGB)
            screen = cv2.resize(screen, (224,224)).astype(np.float32)/255

            # Display the picture
            cv2.imshow("OpenCV/Numpy normal", screen)

            # Press "q" to quit
            if cv2.waitKey(25) & 0xFF == ord("q"):
                cv2.destroyAllWindows()
                break

Now say I have a set of bounding boxes to display on each frame, for example,

bboxes = [np.array([12, 16, 29, 25]), np.array([5,  5, 38, 35])]

Can I alter the pixels in some way to display these? I suppose I could do it via OpenCV as well, since that is what ultimately displays the screen.

EDIT: In reference to the comment about the bounding boxes: they are [x1, y1, width, height], in the coordinates of the resized (224, 224) image.


Solution

  • There are some missing details:

    • What is the format of the bounding boxes: [x1, y1, x2, y2] or [x1, y1, width, height] or something else?
    • Are the bounding box values in the resized (224, 224) range or in the original range?

    Anyway, you can use the function below to draw the rectangles (you need to choose based on the format):

    def draw_bboxes(img, bboxes, color=(0, 0, 255), thickness=1):
        for bbox in bboxes:
            # plain Python ints, in case the boxes hold NumPy scalars
            x1, y1 = int(bbox[0]), int(bbox[1])
            # if [x1, y1, x2, y2], use this line instead:
            # x2, y2 = int(bbox[2]), int(bbox[3])
            # if [x1, y1, width, height]
            x2, y2 = x1 + int(bbox[2]), y1 + int(bbox[3])
            cv2.rectangle(img, (x1, y1), (x2, y2), color, thickness)
    

    Assuming you defined your bboxes, you can call the function:

    • If you want to draw on the original frame:
    # [...]
    screen = np.array(sct.grab(monitor))
    draw_bboxes(screen, bboxes)
    # [...]
    
    • If you want to draw on the resized frame:
    # [...]
    screen = cv2.resize(screen, (224,224)).astype(np.float32)/255
    draw_bboxes(screen, bboxes)
    # [...]
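
Since the boxes in the question are given in the resized (224, 224) coordinates, drawing them on the original frame would first require scaling them back up. A minimal sketch, assuming the [x1, y1, width, height] format; `scale_bboxes` is a hypothetical helper, not part of mss or OpenCV:

```python
import numpy as np

def scale_bboxes(bboxes, src_size, dst_size):
    """Rescale [x1, y1, width, height] boxes from src_size to dst_size.

    Both sizes are (width, height) tuples.
    """
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    factors = np.array([sx, sy, sx, sy])
    # Round to the nearest pixel and return integer arrays
    return [np.rint(np.asarray(b) * factors).astype(int) for b in bboxes]

bboxes = [np.array([12, 16, 29, 25]), np.array([5, 5, 38, 35])]
# 224x224 is the resized frame, 905x586 the captured region from the question
scaled = scale_bboxes(bboxes, src_size=(224, 224), dst_size=(905, 586))
```

The scaled boxes could then be passed to `draw_bboxes(screen, scaled)` right after the grab.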
    

    With some changes, the full code would look like this:

    import cv2
    import time
    import numpy as np
    from mss import mss
    
    def draw_bboxes(img, bboxes, color=(0, 0, 255), thickness=1):
        for bbox in bboxes:
            cv2.rectangle(img, tuple(bbox[:2]), tuple(bbox[:2]+bbox[-2:]), color, thickness)
    
    # bounding boxes
    bboxes = [np.array([12, 16, 29, 25]), np.array([5,  5, 38, 35])]
    
    with mss() as sct:
        # part of the screen to capture
        monitor = {"top": 79, "left": 265, "width": 905, "height": 586}
        while "Screen capturing":
            # get screen
            last_time = time.time()
            screen = np.asarray(sct.grab(monitor))
            print('loop took {} seconds'.format(time.time()-last_time))
    
            # convert from BGRA --> BGR
            screen = cv2.cvtColor(screen, cv2.COLOR_BGRA2BGR)
            # resize and draw bboxes
            screen = cv2.resize(screen, (224,224))
            draw_bboxes(screen, bboxes)
    
            # display
            cv2.imshow("OpenCV/Numpy normal", screen)
    
            # Press "q" to quit
            if cv2.waitKey(25) & 0xFF == ord("q"):
                cv2.destroyAllWindows()
                break
    

    Running the full code shows the resized capture with the two bounding boxes drawn in red.
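
On the question of altering the pixels directly: since each frame is a NumPy array, a box outline can also be written with slice assignment, without calling `cv2.rectangle`. A minimal sketch, assuming a BGR uint8 frame and [x1, y1, width, height] boxes; `draw_bbox_numpy` is a hypothetical helper and only draws a 1-pixel outline:

```python
import numpy as np

def draw_bbox_numpy(img, bbox, color=(0, 0, 255)):
    """Draw a 1-pixel box outline on img in place using slice assignment."""
    h, w = img.shape[:2]
    x1, y1, bw, bh = (int(v) for v in bbox)
    # Clamp the corners to the image bounds
    # (assumes the box overlaps the image at least partly)
    x2, y2 = min(x1 + bw, w - 1), min(y1 + bh, h - 1)
    x1, y1 = max(x1, 0), max(y1, 0)
    img[y1, x1:x2 + 1] = color      # top edge
    img[y2, x1:x2 + 1] = color      # bottom edge
    img[y1:y2 + 1, x1] = color      # left edge
    img[y1:y2 + 1, x2] = color      # right edge

frame = np.zeros((224, 224, 3), dtype=np.uint8)
for bbox in [np.array([12, 16, 29, 25]), np.array([5, 5, 38, 35])]:
    draw_bbox_numpy(frame, bbox)
```

For a handful of boxes per frame, `cv2.rectangle` is probably the simpler choice, since it also handles thickness and clipping.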