azure, azure-storage, azure-cognitive-services, azure-form-recognizer

Where does Azure Form Recognizer store results?


The Azure Cognitive Services Form Recognizer API accepts requests via a POST, and then makes the results available for 48 hours via a GET request to a resource:

POST https://westus.api.cognitive.microsoft.com/formrecognizer/v2.1-preview.3/custom/models/{modelId}/analyze?includeTextDetails={boolean}

GET https://westus.api.cognitive.microsoft.com/formrecognizer/v2.1-preview.3/custom/models/{modelId}/analyzeResults/{resultId}

Presumably Azure stores these results somewhere, but I can't find any documentation regarding it. After 48 hours, is the data deleted or just made unavailable? Where does the data reside? Who owns the data? Does the account owner have access to the underlying storage account or database?


Solution

  • Check the official documentation. It says "The following diagram illustrates how your data is processed," but I cannot see one 🤔

    Anyways, answering ahead....

    is the data deleted or just made unavailable?

    deleted

    The input data and results are deleted within 48 hours and not used for any other purpose. To learn more about privacy and security commitments, see the Microsoft Trust Center and cognitive services compliance and privacy.

    Where does the data reside? Who owns the data?

    Internal to Azure 😕

    The incoming data is processed in the same region where the Cognitive Services Azure resource was created. When you submit your documents to a Form Recognizer operation, it starts the process of analyzing the document to extract all text and identify structure and key values in a document. Your data and results are then temporarily encrypted and stored in Azure Storage.

    When you create a Form Recognizer resource in the Azure portal, you specify a region. From then on, your resource and all of its operations stay associated with that particular Azure server region.

    Does the account owner have access to the underlying storage account or database?

    Analyze Form Result API

    The "Get Analyze Results" operation is authenticated against the same API key that was used to call the "Analyze" operation to ensure no other customer can access your data.

    Azure temporarily stores the results for customers to retrieve: Analyze and Get Results are asynchronous calls. In other words, the service doesn't know when the customers will call the Get Results operation to fetch the extracted results. To facilitate checking the completion status and returning the extracted results to the customer upon completion, the extracted results are stored temporarily in Azure Storage. This behavior allows customers to poll the asynchronous Get Results operation for job completion status and fetch the results upon completion.

    A user has already posted a similar request in the feedback forum; you can vote for it too to get the product group's attention ✌
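The asynchronous flow described in the answer (submit with Analyze, then poll Get Analyze Results with the same key) can be sketched roughly as follows. This is a minimal sketch, not an official client. The URL shapes and API version come from the question. The helper names (`analyze_url`, `result_url`, `http_get_json`, `poll_result`) are hypothetical. The `Ocp-Apim-Subscription-Key` header and the `status` values (`notStarted`, `running`, `succeeded`, `failed`) reflect my understanding of the v2.x REST API, so check them against the API reference for your version.

```python
import json
import time
import urllib.request

# Region and API version copied from the question; adjust for your resource.
ENDPOINT = "https://westus.api.cognitive.microsoft.com"
API_VERSION = "v2.1-preview.3"


def analyze_url(model_id, include_text_details=False):
    """Build the POST URL for the Analyze operation (hypothetical helper)."""
    flag = "true" if include_text_details else "false"
    return (f"{ENDPOINT}/formrecognizer/{API_VERSION}/custom/models/"
            f"{model_id}/analyze?includeTextDetails={flag}")


def result_url(model_id, result_id):
    """Build the GET URL for the Get Analyze Results operation (hypothetical helper)."""
    return (f"{ENDPOINT}/formrecognizer/{API_VERSION}/custom/models/"
            f"{model_id}/analyzeResults/{result_id}")


def http_get_json(url, api_key):
    """GET a URL with the subscription key header and decode the JSON body.

    The header name is an assumption based on the Cognitive Services REST API.
    """
    request = urllib.request.Request(
        url, headers={"Ocp-Apim-Subscription-Key": api_key}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def poll_result(url, api_key, fetch=http_get_json, interval=2.0, max_attempts=30):
    """Poll the result URL until the job reports a terminal status.

    `fetch` is injectable so the loop can be exercised without network access.
    """
    for _ in range(max_attempts):
        body = fetch(url, api_key)
        # Assumed terminal states; any other status means "keep waiting".
        if body.get("status") in ("succeeded", "failed"):
            return body
        time.sleep(interval)
    raise TimeoutError(f"no terminal status after {max_attempts} attempts")
```

Because the service deletes stored results within 48 hours, a caller that needs them longer would have to save the returned JSON to its own storage after `poll_result` returns.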