
Tesseract tsv output not working


I'm trying to run Tesseract from the command line on Ubuntu 17.10. I want the output in a .tsv file because I need the confidence values. As explained here, I run:

tesseract testing_img.png out tsv

but I'm getting the following error:

read_params_file: Can't open tsv
Tesseract Open Source OCR Engine v3.05.00 with Leptonica

and the output is written correctly, but to an out.txt file instead. It seems that Tesseract treats the tsv argument as the name of a file to read, but I don't know why.

I compiled Tesseract from source because I need version 3.05 to get .tsv output; the version in the Ubuntu repository is 3.04, so I can't use it.

I'm running Ubuntu 17.10.

Here's some information about my Tesseract installation:

$ tesseract --version
    tesseract 3.05.00
     leptonica-1.75.3
      libpng 1.6.34 : zlib 1.2.11

$ ls /usr/share/tesseract-ocr/tessdata/
    configs  eng.traineddata  ita.traineddata  osd.traineddata  pdf.ttf  tessconfigs

$ echo $TESSDATA_PREFIX
    /usr/share/tesseract-ocr/

Solution

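  • I had the same problem. In my case, the file called tsv was missing from the directory

    /usr/share/tesseract-ocr/tessdata/configs

    Tesseract treats the trailing tsv argument as the name of a config file and looks for it under tessdata, which is why it reports "Can't open tsv" and still writes plain text. I downloaded the Tesseract source code from:

    https://github.com/tesseract-ocr/tesseract/archive/3.05.00.tar.gz

    and replaced the contents of the configs folder with the one from that archive.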
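
Before replacing anything, it may help to confirm that the config file really is the missing piece. The following is a minimal sketch, assuming the tessdata layout shown in the question; `has_tess_config` is a hypothetical helper name, not part of Tesseract, and the fallback path is simply the one from the question.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract): succeeds if the named
# config file exists under <tessdata-dir>/configs.
# Usage: has_tess_config <tessdata-dir> <config-name>
has_tess_config() {
    [ -f "$1/configs/$2" ]
}

# Assumption: TESSDATA_PREFIX is the parent of tessdata, as in the question.
prefix="${TESSDATA_PREFIX:-/usr/share/tesseract-ocr}"
tessdata="${prefix%/}/tessdata"   # strip a trailing slash before joining

if has_tess_config "$tessdata" tsv; then
    echo "tsv config found in $tessdata/configs"
else
    echo "tsv config missing from $tessdata/configs"
fi
```

If the script reports the file as missing, copying the configs folder from the 3.05.00 source archive, as described above, should restore it. As far as I know, the tsv config in that archive is a one-line file that sets tessedit_create_tsv to 1, but check the archive rather than relying on that.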