Tags: c++, opencv, ocr, tesseract, raspberry-pi3

cv::text::OCRTesseract not respecting filters on Raspberry Pi


I am using the OCRTesseract extra module in OpenCV for text recognition on a Raspberry Pi Model 3. I want it to detect only single, uppercase characters. The following initialization code works perfectly fine on my desktop and laptop:

Ptr<OCRTesseract> tess;
tess = OCRTesseract::create(NULL, NULL, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", 3, 10);

However, when run on the Raspberry Pi it seems to ignore the filters and often returns lowercase characters and symbols, occasionally returning multiple characters at once.
I have tried:

tess->setWhiteList("ABCDEFGHIJKLMNOPQRSTUVWXYZ");

to no avail.

Any suggestions? The OCR works fine apart from this issue. Allowing it to detect lowercase letters/symbols is resulting in a lot more false positives than I am happy with.


Solution

  • After doing some research I found it to be a bug in Tesseract version 4.00.00alpha (which happened to be the version I was running on the Pi).
    Setting the OEM mode (the fourth argument to create) to 0 instead of 3 completely fixed the issue. The following initialization code works as intended:

    Ptr<OCRTesseract> tess;
    tess = OCRTesseract::create(NULL, NULL, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", 0, 10);
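If changing the OEM mode is not an option, one possible fallback is to enforce the "single uppercase character" rule on the recognized text itself. The sketch below is only an illustration, not part of OpenCV or Tesseract: the helper name singleWhitelistedChar and its convention of returning '\0' for rejected output are assumptions made for this example.

```cpp
#include <string>

// Hypothetical helper (not an OpenCV or Tesseract API): accept raw OCR output
// only if, once whitespace is removed, it is exactly one whitelisted character.
// Returns that character, or '\0' when the output is rejected.
char singleWhitelistedChar(
    const std::string& ocrOutput,
    const std::string& whitelist = "ABCDEFGHIJKLMNOPQRSTUVWXYZ")
{
    std::string trimmed;
    for (char c : ocrOutput) {
        // OCR output commonly carries trailing newlines; drop all whitespace.
        if (c != ' ' && c != '\n' && c != '\r' && c != '\t') {
            trimmed.push_back(c);
        }
    }
    // Reject empty results and multi-character results.
    if (trimmed.size() != 1) {
        return '\0';
    }
    // Reject lowercase letters, digits, and symbols outside the whitelist.
    if (whitelist.find(trimmed[0]) == std::string::npos) {
        return '\0';
    }
    return trimmed[0];
}
```

The helper would be applied to the output string that tess->run fills in, and any '\0' result treated as "no detection". This only discards unwanted results after the fact; it does not make the engine itself honor the whitelist.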