php laravel ocr tesseract

Tesseract OCR for PHP blank page


I'm using thiagoalessio's Tesseract OCR for PHP, and it returns a blank page.

Tesseract is installed on my Homestead VM:

vagrant@xxx-yyy-zzz:/usr/bin$ ./tesseract -v
tesseract 3.04.01

This controller produces a blank page:

use thiagoalessio\TesseractOCR\TesseractOCR;

class OCRController extends Controller
{
    public function analyze() {
        echo (new TesseractOCR(asset('storage/text.png')))
            ->executable('/usr/bin/tesseract')->run();
    }
}

Debug PHP code:

use thiagoalessio\TesseractOCR\TesseractOCR;
class OCRController extends Controller
{
    public function analyze() {
        $tesseract = new TesseractOCR(asset('storage/text.png'));
        $tesseract->executable('/usr/bin/tesseract');
        var_dump($tesseract);
    }
}

Output:

/home/vagrant/code/project-io/app/Http/Controllers/OCRController.php:13:
object(thiagoalessio\TesseractOCR\TesseractOCR)[444]
  private 'image' => string 'http://project.test/storage/text.png' (length=38)
  private 'command' => null
  private 'executable' => string '/usr/bin/tesseract' (length=18)
  private 'options' => 
    array (size=0)
      empty

Note that http://project.test/storage/text.png does return the image.

Tesseract works from the command line:

vagrant@xxx-yyy-zzz:~/code/project-io/public/storage$ tesseract text.png stdout
The quick brown fox
jumps over
the lazy dog.

Solution

  • With Laravel and Tesseract OCR for PHP, it seems that the TesseractOCR constructor expects a filesystem path to the image and does not accept a URL. Because asset() returns the image's URL, this won't work. Pass a local path instead.

    $tesseract = new TesseractOCR(asset('storage/text.png')); // Incorrect: asset() returns a URL
    

    Should be:

    $tesseract = new TesseractOCR(storage_path('app/public/text.png')); // Correct
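To catch this kind of mistake early, one option is to validate the argument before handing it to TesseractOCR. The following is only a minimal sketch in plain PHP: the helper name assertLocalImagePath() is hypothetical and is not part of the library or of Laravel, and the URL check is a simple scheme test, not a full URL parser.

```php
<?php

/**
 * Hypothetical helper (not part of thiagoalessio/tesseract_ocr):
 * returns $path unchanged if it looks like a readable local file,
 * otherwise throws so the problem is visible instead of a blank page.
 */
function assertLocalImagePath(string $path): string
{
    // Anything starting with "scheme://" (http://, https://, ...) is a URL,
    // which is what Laravel's asset() helper returns.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $path) === 1) {
        throw new InvalidArgumentException(
            "Expected a filesystem path, got a URL: {$path}"
        );
    }

    // The file must exist and be readable by the PHP process.
    if (!is_file($path) || !is_readable($path)) {
        throw new InvalidArgumentException(
            "Image not found or not readable: {$path}"
        );
    }

    return $path;
}

// Usage inside a Laravel controller (assumes the image lives in
// storage/app/public, as in the answer above):
//
//   $image = assertLocalImagePath(storage_path('app/public/text.png'));
//   echo (new TesseractOCR($image))
//       ->executable('/usr/bin/tesseract')
//       ->run();
```

With this check in place, passing asset('storage/text.png') would raise an InvalidArgumentException rather than silently producing empty output.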