tesseract

Tesseract hOCR character output


I am using the portable version of Tesseract 3.02 and would like to get hOCR output at the character level. The problem is that the hOCR output only shows bounding boxes for words, not for individual characters. If anyone knows of an option in tessdata/config that would do the trick, please let me know. Otherwise, let me know if there is another way around this. I cannot install anything on the computer, so I cannot use the Tesseract API method; only DLL files can be used.


Solution

  • I found that a box file does the same thing: it lists a bounding box for each character, though not in HTML format.
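
For anyone who wants to post-process the box file, here is a minimal sketch in Python. It assumes the box file was produced with something like `tesseract image.png output batch.nochop makebox`, and that each line has the usual layout `<symbol> <left> <bottom> <right> <top> <page>` with the origin at the bottom-left of the image. The names `CharBox`, `parse_box_text` and `to_hocr_bbox` are made up for this example and are not part of Tesseract.

```python
from typing import List, NamedTuple, Tuple


class CharBox(NamedTuple):
    """One character entry from a box file (hypothetical helper type)."""
    char: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_text(text: str) -> List[CharBox]:
    """Parse box-file text into a list of per-character boxes.

    Assumes each line is: <symbol> <left> <bottom> <right> <top> <page>.
    Blank or malformed lines are skipped.
    """
    boxes = []
    for line in text.splitlines():
        # Split from the right so the five numeric fields are always last.
        parts = line.rsplit(maxsplit=5)
        if len(parts) != 6:
            continue
        char, *nums = parts
        try:
            left, bottom, right, top, page = (int(n) for n in nums)
        except ValueError:
            continue
        boxes.append(CharBox(char, left, bottom, right, top, page))
    return boxes


def to_hocr_bbox(box: CharBox, image_height: int) -> Tuple[int, int, int, int]:
    """Convert to an hOCR-style bbox (x0, y0, x1, y1), origin at top-left.

    Box files measure y from the bottom of the image, so y is flipped here.
    """
    return (box.left, image_height - box.top,
            box.right, image_height - box.bottom)
```

To use it, read the `.box` file as UTF-8 text, pass the contents to `parse_box_text`, and call `to_hocr_bbox` with the image height in pixels if top-left coordinates like those in hOCR are needed.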