Tags: c++, visual-c++, tesseract

Using Tesseract 4 in Windows


I am struggling to use Tesseract OCR on Windows. I downloaded tesseract-ocr-w32-setup-v4.0.0-rc1.20181002.exe from here:

https://github.com/UB-Mannheim/tesseract/wiki

I installed it on my machine and then set up the environment variable. However, when I try to extract text from an image with this command:

C:\Users\flaviu.marc>tesseract c:\Flaviu\imagine.png C:\Flaviu\output.txt

I get the following errors:

Error opening data file C:\Program Files (x86)\Tesseract-OCR\eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Could not initialize tesseract.

Can you help me solve this problem? I am trying to use Tesseract in a VC++ app, but I get exactly the same errors as when I run tesseract from the command line.

After I updated the environment variable, I get the following error:

C:\Users\flaviu.marc>tesseract c:\Flaviu\imagine.png C:\Flaviu\output.txt
Tesseract Open Source OCR Engine vv4.0.0-rc1.20181002 with Leptonica
Error in pixReadStreamPng: spp == 1, cmap, trans array, invalid depth: 4
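
The message above appears to describe a colormapped (palette) PNG with a transparency array and a bit depth of 4. One way to check whether a given file has that combination is to read the bit depth and color type from the PNG header. The following is a minimal sketch in C++17, not part of Tesseract or Leptonica; PngHeaderInfo, parsePngHeader and readPngHeader are hypothetical names introduced here for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <optional>
#include <string>

// Hypothetical struct: the IHDR fields that matter for this diagnosis.
struct PngHeaderInfo {
    std::uint32_t width;
    std::uint32_t height;
    int bitDepth;   // 1, 2, 4, 8 or 16
    int colorType;  // 0 = gray, 2 = RGB, 3 = palette, 4 = gray+alpha, 6 = RGBA
};

// Hypothetical helper: parses the first 26 bytes of a PNG file
// (8-byte signature, 4-byte chunk length, "IHDR", width, height, depth, color type).
std::optional<PngHeaderInfo> parsePngHeader(const unsigned char* data, std::size_t size)
{
    static const unsigned char kSignature[8] = {0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};
    if (size < 26 || std::memcmp(data, kSignature, 8) != 0)
        return std::nullopt;
    if (std::memcmp(data + 12, "IHDR", 4) != 0)
        return std::nullopt;

    // PNG stores multi-byte integers in big-endian order.
    auto be32 = [](const unsigned char* p) {
        return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
               (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
    };
    return PngHeaderInfo{be32(data + 16), be32(data + 20), data[24], data[25]};
}

// Hypothetical helper: same check, reading the header bytes from a file.
std::optional<PngHeaderInfo> readPngHeader(const std::string& fileName)
{
    std::ifstream in(fileName, std::ios::binary);
    unsigned char buf[26];
    if (!in.read(reinterpret_cast<char*>(buf), sizeof buf))
        return std::nullopt;
    return parsePngHeader(buf, sizeof buf);
}
```

If readPngHeader("c:\\Flaviu\\imagine.png") reports bitDepth 4 and colorType 3, the file matches the combination named in the error, and re-saving it as an 8-bit or 24-bit PNG is one workaround worth trying.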

Later edit: when I tried another image, the initialization worked, but I still get some error messages:

Error in pixReadMemTiff: function not present
Error in pixReadMem: tiff: no pix returned
Error in pixaGenerateFontFromString: pix not made
Error in bmfCreate: font pixa not made

Why do I encounter these errors? I ask because when I run the classic code, pImage is NULL:

Pix* pImage = pixRead(sFileName);
if(NULL == pImage)
{
    m_sError.Format(_T("Could not read image with leptonica."));
    return sRet;
}

Code is taken from here: https://github.com/tesseract-ocr/tesseract/wiki/APIExample
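
Since pixRead can return NULL for more than one reason (the file cannot be opened, or the format needs a codec the Leptonica build lacks), it can help to know which format the file actually is before calling it. The following is a minimal sketch in C++17 that inspects the first bytes of a file's contents; sniffImageFormat is a hypothetical helper introduced here, not a Leptonica function, and it only recognizes a few common signatures.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: guesses the image format from its leading "magic" bytes.
// Returns "png", "tiff", "jpeg", "bmp" or "unknown".
std::string sniffImageFormat(const unsigned char* data, std::size_t size)
{
    static const unsigned char kPng[8]    = {0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};
    static const unsigned char kTiffLE[4] = {0x49, 0x49, 0x2A, 0x00};  // "II*\0", little-endian TIFF
    static const unsigned char kTiffBE[4] = {0x4D, 0x4D, 0x00, 0x2A};  // "MM\0*", big-endian TIFF

    if (size >= 8 && std::memcmp(data, kPng, 8) == 0)
        return "png";
    if (size >= 4 && (std::memcmp(data, kTiffLE, 4) == 0 || std::memcmp(data, kTiffBE, 4) == 0))
        return "tiff";
    if (size >= 3 && data[0] == 0xFF && data[1] == 0xD8 && data[2] == 0xFF)
        return "jpeg";
    if (size >= 2 && data[0] == 'B' && data[1] == 'M')
        return "bmp";
    return "unknown";
}
```

If the result is "tiff" and the Leptonica build has no TIFF support, a NULL from pixRead would be expected for that file regardless of how Tesseract is configured.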

I compiled Leptonica myself. How can I compile libtiff? I have no option for that ...


Solution

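  • TESSDATA_PREFIX should point to the directory that contains the traineddata files themselves (for example eng.traineddata). The failed lookup of C:\Program Files (x86)\Tesseract-OCR\eng.traineddata in the question suggests the variable was set to the installation directory rather than to the directory holding the traineddata files. Trained data sets are available, for example:

    tessdata default (the tesseract-ocr/tessdata repository on GitHub)

    tessdata good quality but slow (the tesseract-ocr/tessdata_best repository)

    tessdata fast but lower quality (the tesseract-ocr/tessdata_fast repository)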