Because FR v3.0 is still in preview, I followed the v2.1 Quickstarts ("Analyze using a prebuilt model") and navigated to the Form Recognizer Sample Tool. Using Form Type = "Invoice", I tested many sizes and kinds of text, including handwriting. I was very happy with the results, especially the structure of the returned JSON:
```json
...
"analyzeResult": {
  ...
  "readResults": [...],
  "pageResults": [...],
  ...
}
```
For a large/complex image/doc, I use `pageResults.tables[0].cells`: based on `rowIndex` and `columnIndex`, I can easily piece together each row's text and restore the whole document. For a small/simple image/doc, or when `pageResults.tables.length == 0`, I use `readResults.lines` to achieve the same OCR outcome. One size fits all, perfect!
Next was my own hands-on test with the same images, using the JavaScript samples. Because I've only been using Invoice, I picked recognizeInvoice.js: a great sample, easy and simple to follow. Even though it's v3 and is missing `readResults` and `pageResults`, I was still able to use `invoice.pages[0].tables[0].cells` to achieve the same result for a large/complex image/doc. For a small/simple image, though, I found two issues:

1. `invoice.pages[0].tables.length` is 0, so there are no text values.
2. Apart from "NRT LLC." in `invoice.fields.VendorName.value`, all the other printed text and handwriting returned by v2.1 are gone!

I believe there must be reasons on the MS side for these changes, but for us it means v3 is not backward compatible. More importantly, we have no way of knowing whether an image fits a model and/or will return anything before submitting it; even if we offer users a list of model choices, they may be frustrated by the extra manual work. At the moment, all we can do is switch back to Google. So,
below is my navigation route. Thank you, and I really appreciate the great work!
It is a bit confusing, but the versions of the `@azure/ai-form-recognizer` package on NPM are one major version ahead of the Form Recognizer API versions. The preview API version "2021-09-30-preview" (REST API "v3") can be used with Form Recognizer SDK version 4.0.0-beta.2. REST API version v2.1 (GA) is used with SDK version 3.2.0. The README for `@azure/ai-form-recognizer` 3.2.0 explains this:
> Note: This package targets Azure Form Recognizer service API version 2.x.
I'm guessing, based on what you've said, that you are using the latest stable version 3.2.0 of the SDK. When extracting data using a prebuilt or custom model in this version, `tables` are attached to `pages`, and `pages` are attached to forms, so you can access a table by looking through the forms:
```javascript
const poller = await client.beginRecognizeInvoices(inputs);
const invoices = await poller.pollUntilDone();
const table = invoices[0].pages[0].tables[0];
```
If a table appears on a page that isn't associated with any form (no form appears on that page), it can't be accessed using this method. That feature is present in the new beta SDK for the new preview API, but in the current SDK, to get all pages (regardless of whether or not they contain a form), you could consider using the `beginRecognizeContent` method.