I recently switched from a PC running Ubuntu 16.04 to a MacBook Pro running Mac OS X 10.12.6. I'm working on a program that uses tesseract (pytesseract 0.1.7) and OpenCV 3.3.0 for automatic text extraction from ID cards. The program no longer works properly: the OCR output on the MacBook is completely wrong, and I don't understand why. What should I do to make it work on the MacBook Pro the same way it does on Ubuntu?
Configuration:
Ubuntu 16.04: tesseract was built from source
$ tesseract --version
tesseract cf0b378
leptonica -1.74.1
libjpeg 8d (libjpeg-turbo 1.4.2): libpng 1.2.54 : libtiff 4.0.6 : zlib 1.2.8
MacBook Pro, Mac OS X 10.12.6: tesseract installed via Homebrew
$ tesseract --version
tesseract 3.05.01
leptonica-1.74.4
libjpeg 9b : libpng 1.6.32 : libtiff 4.0.8 : zlib 1.2.8
Running the command tesseract image.jpg stdout
with tesseract cf0b378 I get: Gabo / M
with tesseract 3.05.01 I get: GM"
I solved this by installing tesseract with Homebrew's --HEAD option, which builds it from the latest development source instead of the 3.05.01 release:
brew update
brew install tesseract --HEAD
Now I have tesseract 4.00.00alpha, and it works perfectly fine.
Also, I just found this answer here: https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/tesseract-ocr/rdaG14IDVu8/RtihYxlOAQAJ
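Since the different results came from the two machines running different tesseract builds, it may help to have the program report which build it finds on PATH. Below is a minimal sketch of that check, not part of the original program. The helper names parse_tesseract_version and installed_tesseract_version are hypothetical, and the parser assumes the banner's first line looks like "tesseract <version>", as in the outputs shown above. stdout and stderr are merged in case a build prints the banner to stderr.

```python
import shutil
import subprocess


def parse_tesseract_version(banner):
    """Return the version token from a `tesseract --version` banner.

    Assumes a line of the form "tesseract <version>", as in the
    outputs quoted in this post. Returns None if no such line exists.
    """
    for line in banner.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "tesseract":
            return parts[1]
    return None


def installed_tesseract_version():
    """Run `tesseract --version` and return its version token.

    Returns None when no tesseract executable is found on PATH.
    """
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        ["tesseract", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge streams; the banner may go to either
        universal_newlines=True,
    )
    return parse_tesseract_version(result.stdout)


if __name__ == "__main__":
    print(installed_tesseract_version())
```

Running this on both machines before comparing OCR results shows whether they are actually using the same build.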