Saving bounding box coordinates for each frame in a video

I have a video from a camera with humans in the scene. I need to go through each frame of that video and manually draw a square around each human, then save the coordinates of that bounding box along with the coordinate of the center of the head. So for each person I need three points: top-left, bottom-right, and head center. The bounding box has to be a square.

A separate program will then read a file containing the frame number, the coordinates of the square, and the center of the head, and extract each box as an image.

For anybody who has experience with computer vision: is there any open-source software that can do this? If not, what technology would you recommend for building this tool? Any starter code?

Solution

I don't know of any program that does exactly this, but it is a simple problem and you can code it yourself fairly quickly.

Since you work in computer vision, you are probably familiar with OpenCV. You can use it to extract the frames from a video, and its mouse events let you select the box and the head center on each frame.
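Since the box has to be a square, the two points you get from a mouse drag need to be adjusted before saving. Here is a minimal sketch of one way to do that, in plain Python with no OpenCV needed. The function name `square_box` and the rule it uses (side equals the longer drag extent, then the square is slid back inside the frame) are my own assumptions, not something the question prescribes.

```python
def square_box(x0, y0, x1, y1, width, height):
    """Return (x_min, y_min, x_max, y_max) of a square built from a mouse drag.

    (x0, y0) is where the drag started, (x1, y1) where it ended, and
    width/height are the frame dimensions in pixels.
    """
    # Side is the longer drag extent, capped so the square can fit in the frame.
    side = min(max(abs(x1 - x0), abs(y1 - y0)), width, height)

    # Grow the square from the start point in the direction of the drag.
    left = x0 if x1 >= x0 else x0 - side
    top = y0 if y1 >= y0 else y0 - side

    # Slide the square back inside the frame without changing its size.
    left = min(max(left, 0), width - side)
    top = min(max(top, 0), height - side)
    return left, top, left + side, top + side


if __name__ == "__main__":
    # Drag right and down on a 640x480 frame, wider than tall.
    print(square_box(10, 20, 50, 40, 640, 480))
    # Drag that runs past the right edge of the frame.
    print(square_box(620, 10, 700, 30, 640, 480))
```

Keeping the square inside the frame means the second program can crop it directly without having to handle out-of-range coordinates.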

Here are some links that can help you out:

Extract video frames

Detect mouse events
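Putting the two pieces together, a rough sketch of the annotation loop could look like this. Everything about the interaction is an assumption on my part: three left clicks per person (one box corner, the opposite corner, then the head center), `n` for the next frame, `q` to quit, and a CSV file with one row per person. The names `Annotation`, `clicks_to_annotation` and `annotate_video` are made up for this example, and the squaring rule is the simplest one (grow from the top-left corner).

```python
import csv
from dataclasses import dataclass


@dataclass
class Annotation:
    frame: int
    box: tuple   # (x_min, y_min, x_max, y_max), always a square
    head: tuple  # (x, y) of the head center

    def as_row(self):
        return [self.frame, *self.box, *self.head]


def clicks_to_annotation(frame_idx, clicks):
    """Turn three clicks (corner, opposite corner, head center) into an Annotation."""
    (xa, ya), (xb, yb), head = clicks
    left, top = min(xa, xb), min(ya, yb)
    side = max(abs(xb - xa), abs(yb - ya))  # force a square
    return Annotation(frame_idx, (left, top, left + side, top + side), head)


def annotate_video(video_path, csv_path):
    import cv2  # imported here so the helpers above work without OpenCV

    clicks = []

    def on_mouse(event, x, y, flags, param):
        # Called by OpenCV for every mouse event in the window.
        if event == cv2.EVENT_LBUTTONDOWN:
            clicks.append((x, y))

    window = "annotate"
    cv2.namedWindow(window)
    cv2.setMouseCallback(window, on_mouse)
    cap = cv2.VideoCapture(video_path)

    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["frame", "x_min", "y_min", "x_max", "y_max", "head_x", "head_y"])
        frame_idx = 0
        quit_requested = False
        while not quit_requested:
            ok, frame = cap.read()
            if not ok:  # end of video or read error
                break
            clicks.clear()
            done = []
            while True:
                # Every three clicks complete one person.
                while len(clicks) >= 3:
                    done.append(clicks_to_annotation(frame_idx, clicks[:3]))
                    del clicks[:3]
                view = frame.copy()
                for a in done:
                    cv2.rectangle(view, a.box[:2], a.box[2:], (0, 255, 0), 2)
                    cv2.circle(view, a.head, 4, (0, 0, 255), -1)
                cv2.imshow(window, view)
                key = cv2.waitKey(20) & 0xFF
                if key == ord("n"):  # next frame
                    break
                if key == ord("q"):  # save this frame's boxes and stop
                    quit_requested = True
                    break
            writer.writerows(a.as_row() for a in done)
            frame_idx += 1

    cap.release()
    cv2.destroyAllWindows()
```

You would call it with something like `annotate_video("input.mp4", "boxes.csv")` (both file names are placeholders). The second program can then seek to each frame number, read it with `cap.read()`, and crop with NumPy slicing, `frame[y_min:y_max, x_min:x_max]`, after clamping the coordinates to the frame size, since this simple squaring rule can push a box past the right or bottom edge.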