python amazon-web-services amazon-rekognition

How to use desired data from Rekognition image text


I am using the code below with the AWS Rekognition service to analyze text in images. I am only looking for one specific item in the picture: a 10-digit number. I want my program to determine whether the photograph contains a 10-digit number. If it does, give response A; if it does not, give response B.

Step 1. How do I modify my code to print output that contains only this 10-digit number if it is found in the photo? How do I save this number as a variable in the program?

For reference, here is my current code and a portion of the output I am getting. I only want line 5, but it will not always be line 5 or have the same geometry, because the angle and size of the tags in the photograph will change. The image I am testing with is also attached.

Here's the image

This is my code:

import csv
import boto3

# Read the access keys from the credentials CSV, skipping the header row.
with open('credentials.csv', 'r') as credentials_file:
    next(credentials_file)
    reader = csv.reader(credentials_file)
    for line in reader:
        access_key_id = line[2]
        secret_access_key = line[3]

photo = 'DSCN4898.JPG'

client = boto3.client(
    'rekognition',
    aws_access_key_id=access_key_id,
    aws_secret_access_key=secret_access_key,
    region_name='us-east-2',
)

# Run text detection on the image stored in S3.
response = client.detect_text(
    Image={'S3Object': {'Bucket': 'phare.lumberid', 'Name': photo}}
)

# detect_text returns a plain dict, so print it directly (it has no .text attribute).
print(response)

This is a portion of the output (the entry I want is marked with ** below):

{'TextDetections': [{'DetectedText': 'GR', 'Type': 'LINE', 'Id': 0, 'Confidence': 98.77118682861328, 'Geometry': {'BoundingBox': {'Width': 0.004206564277410507, 'Height': 0.0020818368066102266, 'Left': 0.7285668849945068, 'Top': 0.2874905467033386}, 'Polygon': [{'X': 0.7285668849945068, 'Y': 0.2874905467033386}, {'X': 0.7327734231948853, 'Y': 0.22823381423950195}, {'X': 0.7849873304367065, 'Y': 0.2303156554698944}, {'X': 0.7807807922363281, 'Y': 0.2895723879337311}]}}, {'DetectedText': '3B', 'Type': 'LINE', 'Id': 1, 'Confidence': 99.73309326171875, 'Geometry': {'BoundingBox': {'Width': 0.0030030030757188797, 'Height': 0.0015003751032054424, 'Left': 0.5345345139503479, 'Top': 0.29482370615005493}, 'Polygon': [{'X': 0.5345345139503479, 'Y': 0.29482370615005493}, {'X': 0.5375375151634216, 'Y': 0.24456113576889038}, {'X': 0.5905905961990356, 'Y': 0.24531133472919464}, {'X': 0.5885885953903198, 'Y': 0.29632407426834106}]}}, {'DetectedText': '11/16/2020', 'Type': 'LINE', 'Id': 2, 'Confidence': 99.93339538574219, 'Geometry': {'BoundingBox': {'Width': 0.013013012707233429, 'Height': 0.0022505626548081636, 'Left': 0.3553553521633148, 'Top': 0.4043510854244232}, 'Polygon': [{'X': 0.3553553521633148, 'Y': 0.4043510854244232}, {'X': 0.36836835741996765, 'Y': 0.23255814611911774}, {'X': 0.4154154062271118, 'Y': 0.23405851423740387}, {'X': 0.402402400970459, 'Y': 0.4066016376018524}]}}, {'DetectedText': 'RO', 'Type': 'LINE', 'Id': 3, 'Confidence': 99.86873626708984, 'Geometry': {'BoundingBox': {'Width': 0.0030030030757188797, 'Height': 0.0015003751032054424, 'Left': 0.5195195078849792, 'Top': 0.4606151580810547}, 'Polygon': [{'X': 0.5195195078849792, 'Y': 0.4606151580810547}, {'X': 0.522522509098053, 'Y': 0.40735185146331787}, {'X': 0.575575590133667, 
'Y': 0.408852219581604}, {'X': 0.5725725889205933, 'Y': 0.4621155261993408}]}}, {'DetectedText': '10', 'Type': 'LINE', 'Id': 4, 'Confidence': 99.67726135253906, 'Geometry': {'BoundingBox': {'Width': 0.0010010009864345193, 'Height': 0.0, 'Left': 0.35035035014152527, 'Top': 0.5011252760887146}, 'Polygon': [{'X': 0.35035035014152527, 'Y': 0.5011252760887146}, {'X': 0.3513513505458832, 'Y': 0.46436607837677}, {'X': 0.3953953981399536, 'Y': 0.46436607837677}, {'X': 0.3943943977355957, 'Y': 0.5011252760887146}]}}, **{'DetectedText': '0000014819', 'Type': 'LINE', 'Id': 5,** 'Confidence': 98.70645904541016, 'Geometry': {'BoundingBox': {'Width': 0.026078984141349792, 'Height': 0.003469466231763363, 'Left': 0.41238895058631897, 'Top': 0.5906790494918823}, 'Polygon': [{'X': 0.41238895058631897, 'Y': 0.5906790494918823}, {'X': 0.43846791982650757, 'Y': 0.2778533399105072}, {'X': 0.5125654935836792, 'Y': 0.28132280707359314}, {'X': 0.4864864945411682, 'Y': 0.5941485166549683}]}}, {'DetectedText': '04', 'Type': 'LINE', 'Id': 6, 'Confidence': 98.94153594970703, 'Geometry': {'BoundingBox': {'Width': 0.0037548313848674297, 'Height': 0.002500824397429824, 'Left': 0.5068372488021851, 'Top': 0.5926163792610168}, 'Polygon': [{'X': 0.5068372488021851, 'Y': 0.5926163792610168}, {'X': 0.5105921030044556, 'Y': 0.5481716394424438}, {'X': 0.5632959604263306, 'Y': 0.5506724715232849}, {'X': 0.5595411062240601, 'Y': 0.5951172113418579}]}}, {'DetectedText': 'PACKS', 'Type': 'LINE', 'Id': 7, 'Confidence': 

Solution

  • If you want to use pandas to look at the data (in IPython or another interactive environment), you can do this:

    import pandas as pd
    df = pd.DataFrame(response['TextDetections'])
    
    In [192]: pd.DataFrame(response['TextDetections'])
         ...:
    Out[192]:
                  DetectedText  Type  Id  Confidence                                           Geometry  ParentId
    0                       GR  LINE   0   99.072647  {'BoundingBox': {'Width': 0.005556055344641209...       NaN
    1                       3B  LINE   1   99.752129  {'BoundingBox': {'Width': 0.002002001972869038...       NaN
    2               11/16/2020  LINE   2   99.937080  {'BoundingBox': {'Width': 0.013013012707233429...       NaN
    3                       RO  LINE   3   99.875778  {'BoundingBox': {'Width': 0.003003003075718879...       NaN
    4                       10  LINE   4   99.780312  {'BoundingBox': {'Width': 0.001001000986434519...       NaN
    5               0000014819  LINE   5   99.577316  {'BoundingBox': {'Width': 0.02471441961824894,...       NaN
    6                       04  LINE   6   99.152405  {'BoundingBox': {'Width': 0.003001040546223521...       NaN
    7                    PACKS  LINE   7   99.902817  {'BoundingBox': {'Width': 0.010937293991446495...       NaN
    8   River Valley Hardwoods  LINE   8   99.724754  {'BoundingBox': {'Width': 0.014965030364692211...       NaN
    9                       3B  WORD  10   99.752129  {'BoundingBox': {'Width': 0.0503024198114872, ...       1.0
    10                      GR  WORD   9   99.072647  {'BoundingBox': {'Width': 0.05872828885912895,...       0.0
    11              11/16/2020  WORD  11   99.937080  {'BoundingBox': {'Width': 0.17228509485721588,...       2.0
    12                      RO  WORD  12   99.875778  {'BoundingBox': {'Width': 0.05334790423512459,...       3.0
    13                      10  WORD  13   99.780312  {'BoundingBox': {'Width': 0.03677281737327576,...       4.0
    14               Hardwoods  WORD  19   99.487328  {'BoundingBox': {'Width': 0.10112638026475906,...       8.0
    15              0000014819  WORD  14   99.577316  {'BoundingBox': {'Width': 0.31150540709495544,...       5.0
    16                      04  WORD  15   99.152405  {'BoundingBox': {'Width': 0.045055754482746124...       6.0
    17                  Valley  WORD  18   99.796745  {'BoundingBox': {'Width': 0.05704939365386963,...       8.0
    18                   PACKS  WORD  16   99.902817  {'BoundingBox': {'Width': 0.10187213867902756,...       7.0
    19                   River  WORD  17   99.890190  {'BoundingBox': {'Width': 0.05252266675233841,...       8.0
    

    You can see the DetectedText and Confidence values at a glance, but to answer your question and look for 10-digit numbers programmatically, you can loop over the detections and test each DetectedText individually. If you know that the text is always in a certain format, you can test for that. This should get you started:

    In [198]: for text in response['TextDetections']:
         ...:     # print(text['DetectedText'])
         ...:     if len(text['DetectedText']) == 10:
         ...:         print(text['DetectedText'])
         ...:         print(text['DetectedText'].isnumeric())
         ...:
    11/16/2020
    False
    0000014819
    True
    11/16/2020
    False
    0000014819
    True
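
Each match is printed twice above because the response lists every piece of text both as a LINE entry and as a WORD entry (see the Type column in the DataFrame). To save the number as a variable and branch on whether it was found, the loop can be wrapped in a small function. The following is only a sketch: the function name `find_ten_digit_number`, the `min_confidence` parameter, and the trimmed `sample_response` dict are illustrative assumptions, not part of boto3 or of the original code. It uses a regular expression instead of `isnumeric()`, since `isnumeric()` also accepts some non-ASCII numeric characters such as '²'.

```python
import re

# Exactly ten ASCII digits and nothing else.
TEN_DIGITS = re.compile(r"[0-9]{10}")


def find_ten_digit_number(response, min_confidence=0.0):
    """Return the first 10-digit string among LINE detections, or None.

    Hypothetical helper: `response` is assumed to be the dict returned by
    client.detect_text().
    """
    for detection in response.get("TextDetections", []):
        # Text is reported once as LINE and again as WORD,
        # so look at LINE entries only to avoid duplicates.
        if detection.get("Type") != "LINE":
            continue
        if detection.get("Confidence", 0.0) < min_confidence:
            continue
        text = detection.get("DetectedText", "").strip()
        if TEN_DIGITS.fullmatch(text):
            return text
    return None


# Trimmed stand-in for the real response, with values rounded for brevity.
sample_response = {
    "TextDetections": [
        {"DetectedText": "11/16/2020", "Type": "LINE", "Id": 2, "Confidence": 99.93},
        {"DetectedText": "0000014819", "Type": "LINE", "Id": 5, "Confidence": 98.71},
        {"DetectedText": "0000014819", "Type": "WORD", "Id": 14,
         "Confidence": 98.71, "ParentId": 5},
    ]
}

tag_number = find_ten_digit_number(sample_response)
if tag_number is not None:
    print(f"Found tag number: {tag_number}")  # "response A"
else:
    print("No 10-digit number found")         # "response B"
```

With the real response from `client.detect_text(...)`, pass `response` in place of `sample_response`. If the number can share a line with other text on the tag, checking the WORD entries instead of the LINE entries may work better; that depends on how the tags are laid out.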