
Azure OCR [printed text] is not reading the receipt lines in the right order


Application goal: read the receipt image and extract the store/organization name along with the total amount paid, then feed both to a web form for auto-filling and submission.

Post Request - https://*.cognitiveservices.azure.com/vision/v2.0/recognizeText?{params}

Get Request - https://*.cognitiveservices.azure.com/vision/v2.0/textOperations/{operationId}

However, when I get the results back, the line ordering is sometimes mixed up (the screenshot from the Azure Computer Vision page shows this, and the JSON response is similar).

Because of this mix-up, the total is read as $0.88.

The same problem occurs in 2 of my 9 test receipts.

Q: Why does it work for some receipts, both similarly and differently structured, but not consistently for all of them? Also, any ideas on how to get around it?


Solution

  • I had a quick look at your case.

    OCR Result

    As you mentioned, the results are not ordered as you expected. I had a quick look at the bounding box values and I don't know how they are ordered. You could try to consolidate the fields based on those values, but there is a service that already does this for you.
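
If you still want to try consolidating by bounding box, here is a minimal sketch of one possible approach. It is not something the OCR service provides. It assumes each line has the 8-value "boundingBox" (four x, y corner pairs) and "text" shown in the response, and it treats lines whose vertical centers are close together as one receipt row. The function names and the `tolerance` parameter are made up for illustration, and skewed or curled receipts would likely need something more robust.

```python
from statistics import mean


def line_center(line):
    """Return the (x, y) center of a line's 8-value boundingBox."""
    box = line["boundingBox"]
    return mean(box[0::2]), mean(box[1::2])


def line_height(line):
    """Return the vertical extent of a line's boundingBox."""
    ys = line["boundingBox"][1::2]
    return max(ys) - min(ys)


def group_into_rows(lines, tolerance=0.5):
    """Group OCR lines into visual rows, top to bottom, left to right.

    A line joins the current row when its vertical center is within
    `tolerance` * its own height of the row's first line (an assumed
    heuristic, not part of the API).
    """
    rows = []
    for line in sorted(lines, key=lambda l: line_center(l)[1]):
        x, y = line_center(line)
        if rows and abs(y - rows[-1]["y"]) <= tolerance * max(line_height(line), 1):
            rows[-1]["items"].append((x, line["text"]))
        else:
            rows.append({"y": y, "items": [(x, line["text"])]})
    # Sort each row's items by x so labels come before their amounts.
    return [" ".join(text for _, text in sorted(row["items"])) for row in rows]


# Made-up lines in a scrambled order, similar to the problem described.
sample_lines = [
    {"boundingBox": [20, 100, 80, 100, 80, 115, 20, 115], "text": "Total"},
    {"boundingBox": [300, 81, 350, 81, 350, 96, 300, 96], "text": "$0.88"},
    {"boundingBox": [20, 80, 60, 80, 60, 95, 20, 95], "text": "Tax"},
    {"boundingBox": [300, 101, 350, 101, 350, 116, 300, 116], "text": "$9.11"},
]
rows = group_into_rows(sample_lines)
print(rows)
```

With the made-up sample above, each amount ends up on the same row as its label, so "Total" is paired with "$9.11" rather than "$0.88".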

    Form Recognizer:

    Using Form Recognizer and your image, I got the following results for your receipt.

    As you can see below, understandingResults contains the Total with its value ("value": 9.11), the MerchantName ("Chick-fil-a"), and other fields.

    {
        "status": "Succeeded",
        "recognitionResults": [
            {
                "page": 1,
                "clockwiseOrientation": 0.17,
                "width": 404,
                "height": 1226,
                "unit": "pixel",
                "lines": [
                    {
                        "boundingBox": [
                            108,
                            55,
                            297,
                            56,
                            296,
                            71,
                            107,
                            70
                        ],
                        "text": "Welcome to Chick-fil-a",
                        "words": [
                            {
                                "boundingBox": [
                                    108,
                                    56,
                                    169,
                                    56,
                                    169,
                                    71,
                                    108,
                                    71
                                ],
                                "text": "Welcome",
                                "confidence": "Low"
                            },
                            {
                                "boundingBox": [
                                    177,
                                    56,
                                    194,
                                    56,
                                    194,
                                    71,
                                    177,
                                    71
                                ],
                                "text": "to"
                            },
                            {
                                "boundingBox": [
                                    201,
                                    56,
                                    296,
                                    57,
                                    296,
                                    71,
                                    201,
                                    71
                                ],
                                "text": "Chick-fil-a"
                            }
                        ]
                    },
    ...
    OTHER LINES CUT FOR DISPLAY
    ...
                ]
            }
        ],
        "understandingResults": [
            {
                "pages": [
                    1
                ],
                "fields": {
                    "Subtotal": null,
                    "Total": {
                        "valueType": "numberValue",
                        "value": 9.11,
                        "text": "$9.11",
                        "elements": [
                            {
                                "$ref": "#/recognitionResults/0/lines/32/words/0"
                            },
                            {
                                "$ref": "#/recognitionResults/0/lines/32/words/1"
                            }
                        ]
                    },
                    "Tax": {
                        "valueType": "numberValue",
                        "value": 0.88,
                        "text": "$0.88",
                        "elements": [
                            {
                                "$ref": "#/recognitionResults/0/lines/31/words/0"
                            },
                            {
                                "$ref": "#/recognitionResults/0/lines/31/words/1"
                            },
                            {
                                "$ref": "#/recognitionResults/0/lines/31/words/2"
                            }
                        ]
                    },
                    "MerchantAddress": null,
                    "MerchantName": {
                        "valueType": "stringValue",
                        "value": "Chick-fil-a",
                        "text": "Chick-fil-a",
                        "elements": [
                            {
                                "$ref": "#/recognitionResults/0/lines/0/words/2"
                            }
                        ]
                    },
                    "MerchantPhoneNumber": {
                        "valueType": "stringValue",
                        "value": "+13092689500",
                        "text": "309-268-9500",
                        "elements": [
                            {
                                "$ref": "#/recognitionResults/0/lines/4/words/0"
                            }
                        ]
                    },
                    "TransactionDate": {
                        "valueType": "stringValue",
                        "value": "2019-06-21",
                        "text": "6/21/2019",
                        "elements": [
                            {
                                "$ref": "#/recognitionResults/0/lines/6/words/0"
                            }
                        ]
                    },
                    "TransactionTime": {
                        "valueType": "stringValue",
                        "value": "13:00:57",
                        "text": "1:00:57 PM",
                        "elements": [
                            {
                                "$ref": "#/recognitionResults/0/lines/6/words/1"
                            },
                            {
                                "$ref": "#/recognitionResults/0/lines/6/words/2"
                            }
                        ]
                    }
                }
            }
        ]
    }
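
Reading the two values you need out of that response could look roughly like this. It is a sketch that assumes the response has already been parsed into a Python dict (for example with `json.loads`) and has the same shape as the JSON above. The helper name `extract_receipt_fields` is made up, and fields that come back as null (like "Subtotal" here) are returned as None.

```python
def extract_receipt_fields(response, names=("MerchantName", "Total")):
    """Pull the requested field values out of a parsed receipt response.

    Assumes the shape shown above: understandingResults[0]["fields"]
    maps each field name to either null or an object with a "value" key.
    """
    results = response.get("understandingResults") or []
    if not results:
        return {}
    fields = results[0].get("fields") or {}
    extracted = {}
    for name in names:
        field = fields.get(name)
        # A field the service could not find comes back as null (None).
        extracted[name] = field["value"] if field else None
    return extracted


# Trimmed copy of the response above, kept only to exercise the helper.
sample_response = {
    "status": "Succeeded",
    "understandingResults": [
        {
            "pages": [1],
            "fields": {
                "Subtotal": None,
                "Total": {"valueType": "numberValue", "value": 9.11, "text": "$9.11"},
                "MerchantName": {
                    "valueType": "stringValue",
                    "value": "Chick-fil-a",
                    "text": "Chick-fil-a",
                },
            },
        }
    ],
}

receipt = extract_receipt_fields(sample_response)
print(receipt)
```

Using "value" rather than "text" gives you the normalized number (9.11) instead of the printed string ("$9.11"), which is probably what you want for the web form.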
    

    More details on Form Recognizer: https://azure.microsoft.com/en-us/services/cognitive-services/form-recognizer/