java, ocr, tesseract, tess4j

Tesseract 4.5 returns multiple results for the same image structure


Hello everyone, I have a problem with tess4j using Arabic.traineddata. When I run OCR twice, the results are different. The first output:

"| رقم القيد ? : 139\n" +

"18/02/2020 : ?التاريخ\n" +

"SYRIA H.O : ?الفرع?\n" +

The second output:

"رقم القيد ? : 439\n" +

"التاريخ :08/07/2020\n" +

"الفرع : ?SYRIA H.O?\n" +

The last row is reversed, and in another output it could be a different row that is reversed.

I need a solution that makes the OCR always read right-to-left (RTL), or that always gives me the same result.

Thanks to all :)


Solution

  • Tesseract learns and adapts its results over successive runs, so the output for an image can depend on what was recognized before it. You'll need to clear its adaptive classifier or cache between runs (via the ClearAdaptiveClassifier, ClearPersistentCache, or Clear method) to get the same result every time.
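
To illustrate the idea, here is a minimal sketch of the "reset before every run" pattern. It does not call Tess4J directly: `OcrEngine`, `recognizeFresh`, and `DriftingEngine` are hypothetical names made up for this example, and the drifting engine is only a stand-in that imitates adaptation. With Tess4J, `clearAdaptiveState` would presumably wrap a low-level call such as `TessAPI.TessBaseAPIClearAdaptiveClassifier(handle)`, but check that against the version you are using.

```java
final class RepeatableOcr {

    /** Hypothetical abstraction over an OCR engine; not part of Tess4J. */
    interface OcrEngine {
        String recognize(String imagePath);

        /** Drops whatever the engine learned from earlier pages. */
        void clearAdaptiveState();
    }

    /** Clears leftover state first, so each call behaves like a first run. */
    static String recognizeFresh(OcrEngine engine, String imagePath) {
        engine.clearAdaptiveState();
        return engine.recognize(imagePath);
    }

    /** Stand-in engine whose output changes after the first page it sees. */
    static final class DriftingEngine implements OcrEngine {
        private int pagesSeen = 0;

        @Override
        public String recognize(String imagePath) {
            pagesSeen++;
            // Imitates adaptation: the answer differs once state has built up.
            return pagesSeen == 1 ? "first-pass text" : "adapted text";
        }

        @Override
        public void clearAdaptiveState() {
            pagesSeen = 0;
        }
    }

    public static void main(String[] args) {
        OcrEngine engine = new DriftingEngine();
        String first = recognizeFresh(engine, "scan.png");
        String second = recognizeFresh(engine, "scan.png");
        System.out.println(first.equals(second)); // prints "true"
    }
}
```

Calling `engine.recognize(...)` twice without the reset makes the stand-in return two different strings, which is the behaviour described in the question. Putting the reset inside one helper means no call site can forget it.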