I am currently using the Google Cloud Vision API to pull text from images of documents.
Current situation - the API works great and returns a lot of data, including the bounding boxes of where the words are located.
Desired outcome - to request only the words pulled from the image, without all the metadata about the bounding boxes and vertices of the words. That metadata is roughly 99% of the response, which comes out to about 250k, a huge waste when all I want is the words.
const vision = require('@google-cloud/vision');
const client = new vision.ImageAnnotatorClient();
// Performs document text detection (OCR) on the image file
client
  .documentTextDetection('../assets/images_to_ocr/IMG_0942-min.jpg')
  .then(results => {
    // results[0] holds the annotation response for the image
    console.log('result:', results);
  })
  .catch(err => {
    console.error('ERROR:', err);
  });
For now, the Google Cloud Vision client library for Node.js does not have an option for requesting partial responses like the one you are asking for. However, if you just want the text and none of the other metadata, you can filter the response like this:
const fullTextAnnotation = results[0].fullTextAnnotation;
console.log(`Full text: ${fullTextAnnotation.text}`);
The full annotation is in 'fullTextAnnotation', and fullTextAnnotation.text gives you only the text, with '\n' characters separating the text blocks and no metadata. Note that this filters on the client side only: the full response is still sent over the network.
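As a minimal sketch of the filtering described above, the helper below takes the array that documentTextDetection() resolves with and returns only the text. The function names extractFullText and extractLines are hypothetical (they are not part of the client library), and the sketch assumes that fullTextAnnotation may be missing when no text is detected.

```javascript
// Hypothetical helper (not part of @google-cloud/vision): returns only the
// recognized text from the array that documentTextDetection() resolves with.
function extractFullText(results) {
  const response = Array.isArray(results) ? results[0] : undefined;
  const annotation = response && response.fullTextAnnotation;
  // Assumption: fullTextAnnotation can be absent or null when no text is found.
  return annotation && typeof annotation.text === 'string' ? annotation.text : '';
}

// Hypothetical helper: splits the text on '\n' and drops empty lines.
function extractLines(results) {
  return extractFullText(results)
    .split('\n')
    .map(line => line.trim())
    .filter(line => line.length > 0);
}
```

Inside the .then(results => { ... }) callback you could then call extractFullText(results) or extractLines(results) instead of logging the whole response.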
If you are open to using something other than Node.js, the Java client library has a setFields() method on the Annotate class, and in the API Explorer you can apply a partial fields mask to see the effect.
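If you want to stay in Node.js, one possible workaround is to skip the client library and call the REST endpoint directly with a fields mask, the same way the API Explorer does. The sketch below only builds the request. It assumes API-key authentication and that the mask responses.fullTextAnnotation.text is what you want; buildPartialAnnotateRequest is a hypothetical name, not a library function.

```javascript
// REST endpoint for image annotation requests.
const VISION_ENDPOINT = 'https://vision.googleapis.com/v1/images:annotate';

// Hypothetical helper: builds the URL and JSON body for a
// DOCUMENT_TEXT_DETECTION request that asks only for the text field.
// Assumptions: API-key auth via the `key` parameter and a partial
// response mask passed via the `fields` query parameter.
function buildPartialAnnotateRequest(base64Image, apiKey) {
  const url = new URL(VISION_ENDPOINT);
  url.searchParams.set('key', apiKey);
  url.searchParams.set('fields', 'responses.fullTextAnnotation.text');

  const body = {
    requests: [
      {
        image: { content: base64Image },
        features: [{ type: 'DOCUMENT_TEXT_DETECTION' }],
      },
    ],
  };

  return { url: url.toString(), body };
}
```

You would then POST JSON.stringify(body) to url with any HTTP client and a Content-Type of application/json. If the mask is honored, the response should carry only the text and none of the bounding-box metadata.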