
Google Cloud Vision API detects a different number of labels


As per this page, the following code snippet returns 5 labels:

from google.cloud import vision
url = 'https://farm9.staticflickr.com/8215/8267748261_ea142faf5e.jpg'

client = vision.ImageAnnotatorClient()
client.label_detection({'source': {'image_uri': url}}) # yields 5

When I do it as described here, I get 10 labels:

client = vision.Client()
image = client.image(source_uri=url)
labels = image.detect_labels() # yields 10

When I use the Cloud Vision demo page, I get 18 labels for the same image.

Why do these approaches all differ? What am I missing here?


Solution

  • TL;DR - image.detect_labels() takes an optional limit parameter that defaults to 10, which is why your second version returns only 10 labels. If you raise the limit to 18 or more, you get the same result you observed on the Cloud Vision demo page.

    Doc for detect_labels()

    Help on method detect_labels in module google.cloud.vision.image:

    detect_labels(self, limit=10) method of google.cloud.vision.image.Image instance

    Detect labels that describe objects in an image.
    
    :type limit: int
    :param limit: The maximum number of labels to try and detect.
    
    :rtype: list
    :returns: List of :class:`~google.cloud.vision.entity.EntityAnnotation`
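
The three counts in the question are consistent with each entry point applying its own cap to the same underlying set of labels. The following is only a toy model of that behaviour, not the library's implementation: cap_results and the 18 placeholder labels are illustrative assumptions based on the numbers reported above.

```python
def cap_results(available_labels, limit):
    """Return at most `limit` labels, mimicking a maximum-results cap."""
    return available_labels[:limit]


# Hypothetical stand-in for the 18 labels reported for this image.
all_labels = ['label_{0}'.format(i) for i in range(18)]

# Caps taken from the question and answer: 5, 10, and an explicit limit of 100.
for name, limit in [('label_detection() default', 5),
                    ('detect_labels() default', 10),
                    ('explicit limit=100', 100)]:
    print('{0}: {1}'.format(name, len(cap_results(all_labels, limit))))
```

Any limit at or above the number of available labels returns all of them, which is why limit=100 and the demo page agree.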
    

    Working example using Image.detect_labels()

    Try this:

    from google.cloud import vision
    
    IMAGE_URL = 'https://farm9.staticflickr.com/8215/8267748261_ea142faf5e.jpg'
    
    vision_client = vision.Client()
    
    image = vision_client.image(source_uri=IMAGE_URL)
    labels = image.detect_labels(limit=100)
    print('Label Count: {0}'.format(len(labels))) # Result is 18
    print('Labels:')
    for label in labels:
        print(label.description)
    

    Working example using ImageAnnotatorClient.annotate_image()

    You can also set the maximum number of results (which defaults to 5 here) with ImageAnnotatorClient, although the request becomes slightly more verbose:

    from google.cloud import vision
    
    IMAGE_URL = 'https://farm9.staticflickr.com/8215/8267748261_ea142faf5e.jpg'
    
    annot_client = vision.ImageAnnotatorClient()
    request_image = {'source': {'image_uri': IMAGE_URL}}
    label_detection_feature = {
        'type': vision.enums.Feature.Type.LABEL_DETECTION, 'max_results': 100}
    request_features = [label_detection_feature]
    response = annot_client.annotate_image(
        {'image': request_image, 'features': request_features})
    print('Label Count: {0}'.format(len(response.label_annotations))) # Result is 18
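
To see which labels the higher limit adds, it can help to list each description with its score. This is a minimal sketch: summarize_labels is a hypothetical helper, and the annotations below are made-up stand-ins so the snippet runs without credentials. With a real response you would pass response.label_annotations instead.

```python
from types import SimpleNamespace


def summarize_labels(label_annotations):
    """Return (description, score) pairs sorted by descending score."""
    pairs = [(label.description, label.score) for label in label_annotations]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Hypothetical stand-in for response.label_annotations; the values are made up.
fake_annotations = [
    SimpleNamespace(description='example label B', score=0.80),
    SimpleNamespace(description='example label A', score=0.95),
]

for description, score in summarize_labels(fake_annotations):
    print('{0}: {1:.2f}'.format(description, score))
```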
    

    Example using ImageAnnotatorClient.label_detection()

    If you call ImageAnnotatorClient.label_detection() directly, it defaults to a maximum of 5 results, and there does not seem to be a way to configure this limit.

    from google.cloud import vision
    
    IMAGE_URL = 'https://farm9.staticflickr.com/8215/8267748261_ea142faf5e.jpg'
    
    annot_client = vision.ImageAnnotatorClient()
    response = annot_client.label_detection(image={'source': {'image_uri': IMAGE_URL}})
    print('Label Count: {0}'.format(len(response.label_annotations))) # Result is 5
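
If you want the brevity of label_detection() with a configurable limit, one option is a small wrapper around annotate_image(). The sketch below assumes the request shape from the annotate_image() example above. detect_labels_with_limit and StubClient are hypothetical names, and the stub exists only so the snippet can be dry-run without credentials.

```python
from types import SimpleNamespace


def detect_labels_with_limit(client, image_uri, label_feature_type, max_results=50):
    """Request label detection with an explicit max_results and return descriptions."""
    request = {
        'image': {'source': {'image_uri': image_uri}},
        'features': [{'type': label_feature_type, 'max_results': max_results}],
    }
    response = client.annotate_image(request)
    return [label.description for label in response.label_annotations]


class StubClient(object):
    """Hypothetical stand-in for ImageAnnotatorClient, used only for a dry run."""

    def __init__(self):
        self.last_request = None

    def annotate_image(self, request):
        # Record the request and pretend the service knows 18 labels.
        self.last_request = request
        limit = request['features'][0]['max_results']
        labels = [SimpleNamespace(description='label_{0}'.format(i)) for i in range(18)]
        return SimpleNamespace(label_annotations=labels[:limit])


stub = StubClient()
# 'LABEL_DETECTION' is a placeholder; the stub ignores the feature type.
labels = detect_labels_with_limit(
    stub, 'https://example.com/image.jpg', 'LABEL_DETECTION', max_results=100)
print('Label Count: {0}'.format(len(labels)))
```

With the real library you would pass vision.ImageAnnotatorClient() as the client and vision.enums.Feature.Type.LABEL_DETECTION as the feature type, as in the example above. Other releases of the client library may name these fields differently, so check the version you have installed.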