image video machine-learning video-capture

Detecting and classifying text from a video

I'm trying to work on the ICDAR2015 dataset, which poses the problem of detecting and classifying text in video files. I have worked on text detection and classification in static images before, but I have never worked with video data.

Is there a library or tool that will help me capture snapshot images of different frames from the video? Thank you.

Solution

So long as the video is not encrypted, there are quite a few ways to grab frames, depending on the platform you are using.

Given your problem domain and your experience with it, OpenCV, an open source computer vision library, is probably a good match:

http://opencv.org

The documentation includes examples of capturing video frames:

http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_gui/py_video_display/py_video_display.html

For example, adapted from the above tutorial, to read video from a file:

import numpy as np
import cv2

cap = cv2.VideoCapture('vtest.avi')

while cap.isOpened():
    ret, frame = cap.read()
    # ret is False once the video ends or a frame cannot be read
    if not ret:
        break

    # Do whatever work you want on the frame here. In this example
    # from the tutorial, the image is converted from one colour
    # space to another.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # This displays the resulting frame. You may or may not need
    # this for your case.
    cv2.imshow('frame', gray)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()