python image-processing google-cloud-platform google-cloud-vision

License plate recognition using GCP's Cloud Vision

We are developing a mobile app that will allow users to upload images that will be stored in a GCP bucket. However before saving to the bucket we want to blur out any faces and license plates that may be present. We have been using a call to GCP's Cloud Vision service to annotate the image for faces and this has worked quite well. However license plates annotation has turned out to be more challenging. There is no option to detect license plates specifically but instead we seem limited to text-detection which catches license plates, but also all other text that is in image. This is not what we want.

Any pointers on how we might better narrow down the text-recognition to just license plates?

Here's an example of the Python code we are currently using to detect and gather annotation data for faces and text:

from google.cloud import vision
...
def __annotate(image_storage_url):
    result = []

    client = vision.ImageAnnotatorClient()

    response = client.annotate_image({
        'image': {'source': {'image_uri': image_storage_url}},
        'features': [
            {'type': vision.enums.Feature.Type.FACE_DETECTION}, #works great
            {'type': vision.enums.Feature.Type.TEXT_DETECTION}, #too broad
        ],
    })

    # record facial annotations
    faces = response.face_annotations
    for face in faces:
        vertices = [(vertex.x, vertex.y)
                    for vertex in face.bounding_poly.vertices]
        result.append(vertices)

    # record plate annotations
    texts = response.text_annotations
    for text in texts:
        vertices = [(vertex.x, vertex.y)
                    for vertex in text.bounding_poly.vertices]
        result.append(vertices)

    return result

Thanks

Update April 28, 2020 Thanks to an answer from Obed Macallums down below (now marked as answer) I now have working Python code using GCP Cloud Vision to detect and blur license plates on uploaded images to GCP Storage. Here is the relevant Python code:

from google.cloud import vision
...

def __annotate(image_storage_url, img_dimensions):
    result = []

    client = vision.ImageAnnotatorClient()

    response = client.annotate_image({
        'image': {'source': {'image_uri': image_storage_url}},
        'features': [
            {'type': vision.enums.Feature.Type.FACE_DETECTION},
            {'type': vision.enums.Feature.Type.TEXT_DETECTION},
            {'type': vision.enums.Feature.Type.OBJECT_LOCALIZATION},
        ],
    })

    # Blur faces
    faces = response.face_annotations
    for face in faces:
        vertices = [(vertex.x, vertex.y)
                    for vertex in face.bounding_poly.vertices]
        LOGGER.debug('Face detected: %s', vertices)
        result.append(vertices)

    # Blur license plates
    # Note: localized_object_annotations use normalized_vertices which represent the relative-distance
    # (between 0 and 1) and so must be multiplied using the image's height and width
    lo_annotations = response.localized_object_annotations
    for obj in lo_annotations:
        if obj.name == 'License plate':
            vertices = [(int(vertex.x * img_dimensions['width']), int(vertex.y * img_dimensions['height']))
                        for vertex in obj.bounding_poly.normalized_vertices]
            LOGGER.debug('License plate detected: %s', vertices)
            result.append(vertices)

    return result

Solution

Try to use OBJECT_LOCALIZATION, FACE_DETECTION and TEXT_DETECTION in the same image, and filter by "License plate" in OBJECT_LOCALIZATION , with this you could do everything you wanted.