I am struggling to use Tesseract OCR on Windows. Here is what I have installed: tesseract-ocr-w32-setup-v4.0.0-rc1.20181002.exe, from here:
https://github.com/UB-Mannheim/tesseract/wiki
I installed it on my machine and then set up the environment variable,
but when I try to extract text from an image with this command:
C:\Users\flaviu.marc>tesseract c:\Flaviu\imagine.png C:\Flaviu\output.txt
I get the following errors:
Error opening data file C:\Program Files (x86)\Tesseract-OCR\eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Could not initialize tesseract.
Can you help me solve this problem? I am trying to use Tesseract in a VC++ app, but I get exactly the same errors as when I run tesseract from the command line.
After I updated the environment variable, I get the following error instead:
C:\Users\flaviu.marc>tesseract c:\Flaviu\imagine.png C:\Flaviu\output.txt
Tesseract Open Source OCR Engine vv4.0.0-rc1.20181002 with Leptonica
Error in pixReadStreamPng: spp == 1, cmap, trans array, invalid depth: 4
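The message above appears to complain about a colormapped PNG with a bit depth of 4. To see what the file actually declares, independent of Leptonica, the PNG header can be read directly. This is a minimal sketch that uses only the standard library; `PngHeader`, `parsePngHeader` and `readFirstBytes` are hypothetical names for illustration, not part of Tesseract or Leptonica. It relies on the fixed PNG layout: an 8-byte signature followed by the IHDR chunk, whose data holds the width, height, bit depth and color type.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <optional>
#include <string>
#include <vector>

// Fields taken from the IHDR chunk of a PNG file.
struct PngHeader {
    std::uint32_t width;
    std::uint32_t height;
    std::uint8_t bitDepth;   // 1, 2, 4, 8 or 16
    std::uint8_t colorType;  // 0 gray, 2 RGB, 3 palette, 4 gray+alpha, 6 RGBA
};

// Parses the first 26 bytes of a PNG; returns nullopt if they do not
// look like a PNG signature followed by an IHDR chunk.
std::optional<PngHeader> parsePngHeader(const std::vector<std::uint8_t>& bytes) {
    static const std::array<std::uint8_t, 8> kSignature = {
        0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    if (bytes.size() < 26) return std::nullopt;
    for (std::size_t i = 0; i < kSignature.size(); ++i) {
        if (bytes[i] != kSignature[i]) return std::nullopt;
    }
    // Bytes 8..11 are the chunk length, bytes 12..15 the chunk type.
    if (bytes[12] != 'I' || bytes[13] != 'H' || bytes[14] != 'D' || bytes[15] != 'R') {
        return std::nullopt;
    }
    // PNG stores multi-byte integers in big-endian order.
    auto be32 = [&bytes](std::size_t at) {
        return (static_cast<std::uint32_t>(bytes[at]) << 24) |
               (static_cast<std::uint32_t>(bytes[at + 1]) << 16) |
               (static_cast<std::uint32_t>(bytes[at + 2]) << 8) |
               static_cast<std::uint32_t>(bytes[at + 3]);
    };
    return PngHeader{be32(16), be32(20), bytes[24], bytes[25]};
}

// Reads up to `count` bytes from the start of a file.
std::vector<std::uint8_t> readFirstBytes(const std::string& path, std::size_t count) {
    std::ifstream in(path, std::ios::binary);
    std::vector<std::uint8_t> buffer(count);
    in.read(reinterpret_cast<char*>(buffer.data()), static_cast<std::streamsize>(count));
    buffer.resize(static_cast<std::size_t>(in.gcount()));
    return buffer;
}
```

Calling `parsePngHeader(readFirstBytes("c:\\Flaviu\\imagine.png", 26))` and printing `bitDepth` and `colorType` would show whether the image is a 4-bit palette PNG (color type 3). If it is, re-saving it as an 8-bit or 24-bit PNG is a reasonable thing to try, on the assumption that this Leptonica build does not handle that combination.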
Later edit: with another image the initialization now works, but I still get some error messages:
Error in pixReadMemTiff: function not present
Error in pixReadMem: tiff: no pix returned
Error in pixaGenerateFontFromString: pix not made
Error in bmfCreate: font pixa not made
Why do I get these errors? When I run the classic code below, pImage is NULL:
Pix* pImage = pixRead(sFileName);
if (NULL == pImage)
{
    m_sError.Format(_T("Could not read image with leptonica."));
    return sRet;
}
Code is taken from here: https://github.com/tesseract-ocr/tesseract/wiki/APIExample
Here is how I compiled Leptonica:
How can I compile libtiff? I have no option for that ...
TESSDATA_PREFIX should point to the directory that contains the traineddata files, for example: