I'm following the tesstrain readme at https://github.com/tesseract-ocr/tesstrain.
When I run make training, I get the following error:
File not found - *.gt.txt
File not found - *.gt.txt
You are using make version: 4.4.1
Makefile:224: *** found no data/foo-ground-truth/*.gt.txt for data/foo/all-gt. Stop.
I don't understand this error, because I have triple-checked that the sample data (which includes many .gt.txt files) is in data/foo-ground-truth.
Here's what I've done so far that the readme says to do:
I installed the required tools (make, wget, find, bash, unzip, bc) and added them all to my PATH.
I ran make tesseract-langdata. This successfully added a bunch of unicharset files to data/langdata.

Any ideas why it might not be able to find the .gt.txt files that are in the proper directory? I've hit a wall in my troubleshooting.
I'm on Windows 10, with Make version 4.4.1 (as reported in the output above) and Python version 3.11.5.
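I found the answer. C:/Program Files/Git/usr/bin (which contains the GNU find.exe) needs to come first in the PATH, ahead of the Windows system directories. Otherwise make appears to pick up the built-in Windows find.exe, which is a different tool entirely and would explain the "File not found - *.gt.txt" lines in the output. I had added the Git directory to the top of my User PATH, but the User PATH entries are placed after the System PATH entries. Once I added it to the top of my System PATH, everything worked.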
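
In case it helps anyone else debugging the same thing, here is a minimal Python sketch for checking which find wins on your PATH. It is not part of tesstrain; the function name find_all_on_path is my own, and it only approximates how the shell resolves commands (it walks the PATH entries in order and, on Windows, tries each PATHEXT extension).

```python
import os


def find_all_on_path(name, path=None, pathext=None):
    """Return every file matching `name` on PATH, in lookup order.

    Hypothetical helper for illustration: the first entry of the result is
    the one a shell or make would normally run.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    if pathext is None:
        # PATHEXT is normally only set on Windows (e.g. ".COM;.EXE;.BAT").
        pathext = os.environ.get("PATHEXT", "")

    # Always try the bare name, then each executable extension.
    extensions = [""] + [ext for ext in pathext.split(os.pathsep) if ext]

    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        for ext in extensions:
            candidate = os.path.join(directory, name + ext)
            if os.path.isfile(candidate):
                matches.append(candidate)
                break  # one match per directory is enough
    return matches


if __name__ == "__main__":
    for position, location in enumerate(find_all_on_path("find")):
        note = "  <- this one is used" if position == 0 else ""
        print(f"{location}{note}")
```

If the first line printed is under C:\Windows\System32 rather than the Git usr/bin directory, the PATH order is the problem. Running where find in a Windows command prompt should give the same ordered list without any script.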