Parsing and saving Google Vision API response as JSON


I am currently testing the Google Vision API for some basic handwritten text recognition and have no trouble getting a decent response for my image.

However, I am struggling to save the received response locally. I have tried several ways of parsing the AnnotateImageResponse:

  • using MessageToJson and MessageToDict from google.protobuf.json_format
  • passing response.SerializeToString() to json.loads()
  • saving the response as a binary file and reloading it from disk for JSON parsing
  • saving only parts of the response (response.text_annotations and response.full_text_annotation individually)

I keep getting an error telling me that the object does not have an attribute called 'DESCRIPTOR'.

When I looked into the response.txt file I created by converting the response to a string, I noticed that full_text_annotation does not have such a descriptor on its top-level keys, so I thought saving only text_annotations would solve this. Unfortunately it does not; I still get the same error.

This is my code so far (not working):

import os
from google.cloud import vision
from google.protobuf.json_format import MessageToJson

vision_client = vision.ImageAnnotatorClient()
path = './images/'
name = 'test.jpg'
with open(os.path.join(path, name), 'rb') as image_file:
    opened_image = image_file.read()
image = vision.Image(content=opened_image)
response = vision_client.document_text_detection(
    image=image,
    image_context={"language_hints": ["en-t-i0-handwrit"]},
)

# tried extracting only whole words here - doesn't work
all_words = response.text_annotations
# The next line raises
# AttributeError: 'RepeatedComposite' object has no attribute 'DESCRIPTOR'
all_words_json = MessageToJson(all_words)

All help is appreciated, either with how to turn the response into a JSON file directly or with how to iterate over it correctly and build the JSON that way. Thanks!

This is a part of the received Cloud Vision response:


...

text_annotations {
  description: "Congratulations"
  bounding_poly {
    vertices {
      x: 2334
      y: 2452
    }
    vertices {
      x: 3284
      y: 2464
    }
    vertices {
      x: 3282
      y: 2615
    }
    vertices {
      x: 2332
      y: 2603
    }
  }
}
text_annotations {
  description: "!"
  bounding_poly {
    vertices {
      x: 3321
      y: 2464
    }
    vertices {
      x: 3411
      y: 2465
    }
    vertices {
      x: 3409
      y: 2615
    }
    vertices {
      x: 3319
      y: 2614
    }
  }
}
full_text_annotation {
  pages {
    property {
      detected_languages {
        language_code: "en"
        confidence: 0.991942465
      }
    }
    width: 4032
    height: 3024
    blocks {
      bounding_box {
        vertices {
          x: 446
          y: 486
        }
        vertices {
          x: 3541
          y: 475
        }
        vertices {
          x: 3549
          y: 2618
        }
        vertices {
          x: 454
          y: 2629
        }
      }
      paragraphs {
        bounding_box {
          vertices {
            x: 2490
            y: 912
          }
          vertices {
            x: 2633
            y: 910
          }
          vertices {
            x: 2634
            y: 957
          }
          vertices {
            x: 2491
            y: 959
          }
        }
        words {
          bounding_box {
            vertices {
              x: 2490
              y: 912
            }
            vertices {
              x: 2633
              y: 910
            }
            vertices {
              x: 2634
              y: 957
            }
            vertices {
              x: 2491
              y: 959
            }
          }
          symbols {
            bounding_box {
              vertices {
                x: 2490
                y: 913
              }
              vertices {
                x: 2552
                y: 912
              }
              vertices {
                x: 2553
                y: 958
              }
              vertices {
                x: 2491
                y: 959
              }
            }
            text: "a"
            confidence: 0.961887
          }
          symbols {
            property {
              detected_break {
                type_: LINE_BREAK
              }
            }
            bounding_box {
              vertices {
                x: 2552
                y: 911
              }
              vertices {
                x: 2633
                y: 910
              }
              vertices {
                x: 2634
                y: 955
              }
              vertices {
                x: 2553
                y: 956
              }
            }
            text: "m"
            confidence: 0.957640529
          }
          confidence: 0.959763765
        }
        confidence: 0.959763765
      }
      paragraphs {
        bounding_box {
          vertices {
            x: 471
            y: 485
          }

...
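The excerpt nests its data as pages, blocks, paragraphs, words, and symbols. As a rough sketch of how that hierarchy could be walked to rebuild each word from its symbols, the snippet below uses SimpleNamespace stand-ins in place of a real response, so it does not call the Vision API. The function name words_from_full_text is made up for illustration, and the sketch assumes the attribute names match the field names shown in the excerpt.

```python
from types import SimpleNamespace as NS


def words_from_full_text(full_text_annotation):
    """Rebuild each word by joining the text of its symbols."""
    words = []
    for page in full_text_annotation.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    text = "".join(symbol.text for symbol in word.symbols)
                    words.append({"text": text, "confidence": word.confidence})
    return words


# Hypothetical stand-in that mirrors the excerpt: one word made of "a" and "m".
sample = NS(pages=[NS(blocks=[NS(paragraphs=[NS(words=[
    NS(confidence=0.959763765, symbols=[NS(text="a"), NS(text="m")]),
])])])])

print(words_from_full_text(sample))
```

With a real response, the argument would presumably be response.full_text_annotation.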


Solution

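  • I figured it out: the response returned by the client is a proto-plus wrapper around the underlying protobuf message, and MessageToJson expects the raw protobuf message (the object that carries the DESCRIPTOR attribute). The wrapper exposes that raw message as ._pb, so that is what has to be passed: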

    from google.protobuf.json_format import MessageToJson

    # MessageToJson needs the raw protobuf message, available as ._pb
    json_response = MessageToJson(response._pb)

    # save json_response as a .json file
    with open('response.json', 'w') as json_file:
        json_file.write(json_response)
    

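    This also explains why passing response.text_annotations failed: a repeated field is a container of messages rather than a message itself, so it has no DESCRIPTOR attribute.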
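For the other route mentioned in the question, iterating over the response by hand, here is a minimal sketch. It assumes each annotation exposes description and bounding_poly.vertices with x and y, as in the excerpt. The helper names annotation_to_dict and annotations_to_json are made up for illustration, and SimpleNamespace stand-ins replace the real response so the snippet runs without the Vision client.

```python
import json
from types import SimpleNamespace as NS


def annotation_to_dict(annotation):
    """Copy one text annotation into plain, JSON-serializable types."""
    return {
        "description": annotation.description,
        "vertices": [
            {"x": v.x, "y": v.y} for v in annotation.bounding_poly.vertices
        ],
    }


def annotations_to_json(annotations):
    """Serialize an iterable of annotations as a JSON array."""
    return json.dumps([annotation_to_dict(a) for a in annotations], indent=2)


# Hypothetical stand-in for response.text_annotations, using values from the question.
sample = [
    NS(description="!", bounding_poly=NS(vertices=[
        NS(x=3321, y=2464), NS(x=3411, y=2465),
        NS(x=3409, y=2615), NS(x=3319, y=2614),
    ])),
]

json_text = annotations_to_json(sample)
print(json_text)
```

Building plain dicts keeps only the fields you choose, which can be handy when the full response is larger than what you want to store. The result can be written to disk with the same open(...) and write(...) pattern used above.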