Are any of the current text capture APIs (e.g. Google's Text API) fast enough to capture text from a phone's video feed, and draw a box that stays on the text even as the camera moves?
I don't need fast enough to do full OCR per-frame (though that would be amazing!). I'm just looking for fast enough to recognize blocks of text and keep the bounding box displayed in sync with the live image.
There are two major options for good results. They are both C++ but there are wrappers. I've personally played with OpenCV for face recognition and the results were promising. Below links with small tutorials and demos.