azure-functions, azure-form-recognizer

Form Recognizer custom model - how to use a Python parser with the API GET response


I'm building a solution that extracts PDF content with a Form Recognizer custom model and writes the result to a SQL Server database.

Using the JSON produced by the Form Recognizer UI, I built a Python JSON parser that fits my custom model. It successfully adds rows to SQL Server when I pass my function the result of loading example.json with json.load.

Now I'm trying to pass the "result" returned by the API GET response to my parsing function, but I can't make it work. I have tried many approaches and always end up with almost the same error:

[2022-10-02T08:12:40.755Z] System.Private.CoreLib: Exception while executing function: Functions.BlobTrigger1. System.Private.CoreLib: Result: Failure
Exception: TypeError: 'AnalyzeResult' object is not subscriptable

Here is how my parser works:

def insert_json_into_mssql(result_json):
    try:

        analyzeResult = result_json["analyzeResult"]
        documents_list = analyzeResult["documents"]

        connection_string = "Driver={ODBC Driver 17 for SQL Server};Server=tcp:xxxxxxserverdev.database.windows.net,1433;Database=nip_facturation_dev;Uid=xxxxxxxx;Pwd=xxxxxxxxxx;Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;"
        mssql_con = pyodbc.connect(connection_string)
        mssql_con.setdecoding(pyodbc.SQL_CHAR, encoding='UTF-8')
        mssql_con.setencoding('UTF-8')
        cursor = mssql_con.cursor()

             

        x = 0
        for doc in documents_list :
            x = x+1
            print("Processing document index "+str(x))

            fields = doc["fields"]


            if "enseigne" in fields:
                enseigneO = fields["enseigne"]
                enseigne = enseigneO["content"]
                print("enseigne= "+str(enseigne))
            else:
                enseigne = None
                print("enseigne= "+str(enseigne))

And this is how I call the API and get the result:

def main(myblob: func.InputStream):
    logging.info(f"Python blob trigger function processed blob \n"
                 f"Name: {myblob.name}\n"
                 f"Blob Size: {myblob.length} bytes")    

    
    endpoint = "https://westeurope.api.cognitive.microsoft.com/"
    api_key = "xxxxxxxxxxxxxxxxxxxxxxx"
    credential = AzureKeyCredential(api_key)
    source = myblob.read()
    model_id = "my_model"

    document_analysis_client = DocumentAnalysisClient(endpoint, credential)

    poller = document_analysis_client.begin_analyze_document(model_id, document=source)
    result_json = poller.result()

    insert_json_into_mssql(result_json)

I know that I'm missing a step between the API GET response and the way I pass the result to my parser. Ideally, I would like to read the response without writing the result as a JSON file to blob storage.

Thanks :)


Solution

    • poller.result() does not return JSON. It returns an object of class AnalyzeResult, which is why indexing it with square brackets raises the TypeError.

    • The contents of that object are exposed as attributes rather than dictionary keys. For example, the key-value pairs detected in the document are available as a list through result.key_value_pairs, where result is the value returned by poller.result().
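    If the dictionary-based parser from the question should stay unchanged, one option is to convert the object to plain dictionaries first. The sketch below assumes the installed azure-ai-formrecognizer version provides a to_dict() method on AnalyzeResult (worth checking against your version). The helper name to_parser_input and the _FakeResult stand-in are hypothetical and exist only for illustration.

```python
def to_parser_input(result):
    """Convert an SDK result object into the dict shape the existing parser expects.

    Assumption: `result` exposes a `to_dict()` method, as AnalyzeResult is
    expected to in recent azure-ai-formrecognizer releases.
    """
    # to_dict() yields plain dicts and lists, so ["documents"]-style access works.
    as_dict = result.to_dict()
    # The parser in the question looks up "analyzeResult" first, so mirror that wrapper.
    return {"analyzeResult": as_dict}


class _FakeResult:
    """Hypothetical stand-in for AnalyzeResult, used only to show the shape."""

    def to_dict(self):
        return {"documents": [{"fields": {"enseigne": {"content": "ACME"}}}]}


parsed = to_parser_input(_FakeResult())
enseigne = parsed["analyzeResult"]["documents"][0]["fields"]["enseigne"]["content"]
print(enseigne)
```

    In the function from the question this would amount to calling insert_json_into_mssql(to_parser_input(poller.result())). Key names deeper in the dictionary may not match the REST JSON exactly, since the SDK uses Python-style names, so any key beyond "documents", "fields" and "content" should be checked.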

    Consider the following code, which calls Form Recognizer and iterates through its findings:

    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential

    endpoint = "YOUR_FORM_RECOGNIZER_ENDPOINT"
    key = "YOUR_FORM_RECOGNIZER_KEY"

    formUrl = ""

    document_analysis_client = DocumentAnalysisClient(
        endpoint=endpoint,
        credential=AzureKeyCredential(key)
    )

    poller = document_analysis_client.begin_analyze_document_from_url("prebuilt-document", formUrl)

    # result is an AnalyzeResult object, not a dict
    result = poller.result()

    print("----Key-value pairs found in document----")

    for kv_pair in result.key_value_pairs:
        if kv_pair.key and kv_pair.value:
            print("Key '{}': Value: '{}'".format(kv_pair.key.content, kv_pair.value.content))
        else:
            print("Key '{}': Value:".format(kv_pair.key.content))
    


    Refer to this MSDOC on the DocumentAnalysisClient class.

    Refer to this MSDOC on the AnalyzeResult class.
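    For a custom model like the one in the question, the extracted fields are usually found under result.documents rather than result.key_value_pairs. A minimal sketch of reading them with attribute access follows. It assumes each document has a fields mapping whose values carry a content attribute; the function name extract_field_contents and the field name "total" are hypothetical, and the SimpleNamespace objects only stand in for the real SDK classes so the snippet runs without a service call.

```python
from types import SimpleNamespace


def extract_field_contents(result, field_names):
    """Return one dict per analyzed document, mapping field name to its text content.

    Uses attribute access (result.documents, doc.fields, field.content)
    instead of subscripting the result object.
    """
    rows = []
    for doc in result.documents or []:
        fields = doc.fields or {}
        row = {}
        for name in field_names:
            field = fields.get(name)
            # A field the model did not return is stored as None.
            row[name] = field.content if field is not None else None
        rows.append(row)
    return rows


# Stand-in for an AnalyzeResult with one document and one extracted field.
fake_result = SimpleNamespace(
    documents=[
        SimpleNamespace(fields={"enseigne": SimpleNamespace(content="ACME")})
    ]
)

rows = extract_field_contents(fake_result, ["enseigne", "total"])
print(rows)
```

    Each row can then be passed to the SQL insert logic directly, which avoids writing the response to blob storage as a JSON file.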