How can I define regions on an image, then pass this data to Tesseract-OCR's command-line, so that only text within the defined regions would be extracted?
I'm guessing this may be similar to the use of an image-map in HTML.
Thanks in advance for your responses.
I found out how to pass in regions on an image to Tesseract.
Although it cannot be done through the command line, the Tesseract 3.02 API supports the function SetRectangle(int left, int top, int width, int height), which restricts text extraction to the specified region. It must be called after the SetImage() function.
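Since the question compares this to an HTML image-map, here is a rough sketch of one way to turn image-map style corner coordinates (x1, y1, x2, y2) into the (left, top, width, height) arguments that SetRectangle() takes. The Region struct and the clampTo and regionFromCorners helpers are hypothetical names made up for illustration; they are not part of Tesseract, and clamping to the image bounds is my own assumption about what is sensible here.

```cpp
#include <algorithm>

// Hypothetical helper type (not part of Tesseract): a region in the
// (left, top, width, height) form that SetRectangle() takes.
struct Region {
    int left;
    int top;
    int width;
    int height;
};

// Limit value to the range [0, upper].
static int clampTo(int value, int upper) {
    return std::max(0, std::min(value, upper));
}

// Convert image-map style corners (x1, y1, x2, y2) into a Region,
// clamped to an image of imageWidth x imageHeight pixels.
// The two corners may be given in either order.
Region regionFromCorners(int x1, int y1, int x2, int y2,
                         int imageWidth, int imageHeight) {
    int left   = clampTo(std::min(x1, x2), imageWidth);
    int right  = clampTo(std::max(x1, x2), imageWidth);
    int top    = clampTo(std::min(y1, y2), imageHeight);
    int bottom = clampTo(std::max(y1, y2), imageHeight);
    return Region{left, top, right - left, bottom - top};
}

// Intended use, assuming an initialized tesseract::TessBaseAPI named api:
//   api.SetImage(...);                                   // first
//   api.SetRectangle(r.left, r.top, r.width, r.height);  // then restrict
//   char* text = api.GetUTF8Text();                      // text from region only
```

For several regions on the same image, calling SetRectangle() once per region and reading the text after each call should work, though I have only described the single-region case above.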
Thanks again.