Cloud Vision API - PDF OCR

I just tested the Google Cloud Vision API to read the text, if any exists, in an image.

So far I have installed Maven and the Redis server, following the instructions on this page.

So far I have only been able to test it with .jpg files. Is it possible to do the same with TIFF or PDF files?

I am using the following command:

java -cp target/text-1.0-SNAPSHOT-jar-with-dependencies.jar com.google.cloud.vision.samples.text.TextApp ../../data/text/

Inside the text directory, I have the files in jpg format.

I don't know how to read the converted files afterwards, so I just run the following command:

java -cp target/text-1.0-SNAPSHOT-jar-with-dependencies.jar com.google.cloud.vision.samples.text.TextApp

I then get a prompt asking me to enter a word or phrase to search for in the converted files. Is there a way to see the whole converted document?

Thanks!

Solution

In 2016, the PDF and TIFF formats were not supported by Cloud Vision.

The accepted formats are (taken from the doc):

But support for both has since been added.

Docs for jpg:

Docs for pdf
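
To illustrate the PDF and TIFF support mentioned above, here is a minimal sketch in plain Java that only builds the JSON body for the REST method `files:asyncBatchAnnotate`. It rests on several assumptions: the endpoint and field names are as I recall them from the Vision REST reference, the input file already sits in a Google Cloud Storage bucket, and the class name `PdfOcrRequest` and the `gs://my-bucket/...` URIs are made up for the example. Authentication and the HTTP call itself are left out.

```java
import java.util.Locale;

// Hypothetical helper, not part of the Cloud Vision samples.
final class PdfOcrRequest {
    // REST endpoint for asynchronous file (PDF/TIFF) annotation, as I understand the docs.
    static final String ENDPOINT =
        "https://vision.googleapis.com/v1/files:asyncBatchAnnotate";

    // Map the file extension to the MIME type expected in inputConfig.
    static String mimeTypeFor(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".pdf")) {
            return "application/pdf";
        }
        if (lower.endsWith(".tif") || lower.endsWith(".tiff")) {
            return "image/tiff";
        }
        throw new IllegalArgumentException("Not a PDF or TIFF file: " + fileName);
    }

    // Build the request body by hand. The URIs are assumed to contain no
    // characters that need JSON escaping (quotes, backslashes).
    static String buildBody(String sourceUri, String outputPrefixUri) {
        return "{\"requests\":[{"
            + "\"inputConfig\":{"
            + "\"gcsSource\":{\"uri\":\"" + sourceUri + "\"},"
            + "\"mimeType\":\"" + mimeTypeFor(sourceUri) + "\"},"
            + "\"features\":[{\"type\":\"DOCUMENT_TEXT_DETECTION\"}],"
            + "\"outputConfig\":{"
            + "\"gcsDestination\":{\"uri\":\"" + outputPrefixUri + "\"},"
            + "\"batchSize\":2}"
            + "}]}";
    }

    public static void main(String[] args) {
        // Bucket and object names below are placeholders.
        System.out.println("POST " + ENDPOINT);
        System.out.println(
            buildBody("gs://my-bucket/scan.pdf", "gs://my-bucket/ocr-output/"));
    }
}
```

If these assumptions hold, the recognised text is not returned in the response itself. The service writes JSON result files under the output prefix, and reading those files is one way to see the whole converted document rather than searching it word by word.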