1 - I downloaded tessdata from GitHub: https://github.com/tesseract-ocr/tessdata
2 - I extracted the folder and passed its path to the Tesseract class
3 - When I run the application, the following error is displayed
Code snippet executed
public class TesseractOcrTest {

    private final String tesseractPath = "/home/tessdata/";

    @Test
    public void shouldReturnTrueIfRunOcrEquals() throws Exception {
        String result = new TesseractOcr(tesseractPath).runOcr("bw_HighResolution_en.jpeg").trim();
        assertEquals(
            "Optical Character Recognition in Java\nis made easy with the help of Tesseract", result);
    }
}
Error
Error: Illegal Parameter specification!
"Fatal error encountered!" == NULL:Error:Assert failed:in file globaloc.cpp, line 75
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f52a582d69b, pid=8957, tid=8966
#
# JRE version: OpenJDK Runtime Environment (11.0.7+10) (build 11.0.7+10-post-Ubuntu-2ubuntu218.04)
# Java VM: OpenJDK 64-Bit Server VM (11.0.7+10-post-Ubuntu-2ubuntu218.04, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# C [libtesseract.so.4+0x25969b] ERRCODE::error(char const*, TessErrorLogCode, char const*, ...) const+0x16b
Note: When I change the path from the downloaded tessdata to the OS installation path (private final String tesseractPath = "/usr/share/tesseract-ocr/4.00/tessdata/";), it works perfectly. It only fails when I point to the tessdata downloaded from GitHub.
What am I doing wrong? Does the tessdata downloaded from GitHub need any additional configuration?
You've probably used incompatible language data. For the current Tesseract version, use tessdata_best or tessdata_fast; the latter is what comes with Linux distros. (You can verify by checking the file size.)
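The file-size check can be scripted. The sketch below lists the size of each .traineddata file in the two directories mentioned in the question. The class and method names (TessdataSizeCheck, trainedDataSizes) are made up for illustration, and both paths are taken from the question, so adjust them to your setup. A clearly different size for the same language file suggests the two directories hold different model sets. Treat this as a rough hint, not proof of incompatibility.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Tesseract or any wrapper library.
public class TessdataSizeCheck {

    // Maps each *.traineddata file directly inside dir to its size in bytes.
    static Map<String, Long> trainedDataSizes(Path dir) throws IOException {
        Map<String, Long> sizes = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.traineddata")) {
            for (Path file : files) {
                sizes.put(file.getFileName().toString(), Files.size(file));
            }
        }
        return sizes;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: these are the two locations from the question.
        String[] dirs = {"/home/tessdata/", "/usr/share/tesseract-ocr/4.00/tessdata/"};
        for (String d : dirs) {
            Path dir = Paths.get(d);
            if (!Files.isDirectory(dir)) {
                System.out.println(d + " not found");
                continue;
            }
            System.out.println(dir);
            trainedDataSizes(dir).forEach(
                (name, bytes) -> System.out.printf("  %-28s %,d bytes%n", name, bytes));
        }
    }
}
```

Compare the line for the same file (for example eng.traineddata, if present) in both listings.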