azure-form-recognizer

Is Model Training Required?


I'm new to Azure Form Recognizer (AFR), so please pardon a basic question. We're a book publisher, and our content often has no fixed layout at all, because different authors have their own preferences even on the same subject (recipes, for example). However, everything is still laid out on a page.

So my question is: is there a generic layout model available, so that we don't need to do any training? We would then use the boundingBox values to reconstruct the layout and piece the content together. A URL to a sample would be great. Thank you.


Solution

  • Yes, there is a generic option that needs no training: you can use Form Recognizer Layout to extract the text and tables from the books and analyze the pages. You can try it out in the sample tool UX by selecting Layout.

    Or use the API - https://{endpoint}/formrecognizer/v2.1-preview.3/layout/analyze?readingOrder=natural

    See here for more information - https://learn.microsoft.com/en-us/azure/cognitive-services/form-recognizer/concept-layout

    Set readingOrder=natural to get the text extracted in reading order. If a page has multiple columns or other text arranged in separate groups, this setting returns the lines in the order a person would read them.
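
To make the answer above a bit more concrete, here is a minimal Python sketch of calling the Layout endpoint and turning each line's boundingBox into a rectangle that could be used to rebuild the page. Only the URL path and the readingOrder parameter come from the answer. Everything else is an assumption on my part about the v2.1 REST API: the `Ocp-Apim-Subscription-Key` header, the `{"source": ...}` request body, the `Operation-Location` polling pattern, and the `analyzeResult.readResults[].lines[]` response fields. The helper names and the sample response are made up for illustration, so please check them against the documentation linked above before relying on them.

```python
import json
import time
import urllib.request

# Path copied from the answer above; adjust the version for your resource.
API_PATH = "/formrecognizer/v2.1-preview.3/layout/analyze"


def build_layout_url(endpoint, reading_order="natural"):
    """Build the Layout analyze URL for a resource endpoint."""
    return f"{endpoint.rstrip('/')}{API_PATH}?readingOrder={reading_order}"


def analyze_layout(endpoint, key, document_url, poll_seconds=2.0):
    """Submit a document by URL and poll until the analysis finishes.

    Assumption: the POST returns an Operation-Location header that is
    then polled with GET until status is "succeeded" or "failed".
    """
    body = json.dumps({"source": document_url}).encode("utf-8")
    headers = {"Ocp-Apim-Subscription-Key": key, "Content-Type": "application/json"}
    request = urllib.request.Request(
        build_layout_url(endpoint), data=body, headers=headers, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        operation_url = response.headers["Operation-Location"]
    while True:
        poll = urllib.request.Request(
            operation_url, headers={"Ocp-Apim-Subscription-Key": key}
        )
        with urllib.request.urlopen(poll) as response:
            result = json.load(response)
        if result.get("status") in ("succeeded", "failed"):
            return result
        time.sleep(poll_seconds)


def bounding_box_to_rect(bounding_box):
    """Collapse an 8-number polygon [x1, y1, ..., x4, y4] into (left, top, right, bottom)."""
    xs, ys = bounding_box[0::2], bounding_box[1::2]
    return (min(xs), min(ys), max(xs), max(ys))


def extract_lines(result):
    """Return (page, text, rect) tuples in the order the service returned them."""
    lines = []
    for page in result.get("analyzeResult", {}).get("readResults", []):
        for line in page.get("lines", []):
            rect = bounding_box_to_rect(line["boundingBox"])
            lines.append((page["page"], line["text"], rect))
    return lines


# Hand-written sample in the assumed response shape (not real service output).
sample = {
    "status": "succeeded",
    "analyzeResult": {
        "readResults": [
            {"page": 1, "unit": "inch", "lines": [
                {"text": "Tomato Soup",
                 "boundingBox": [1.0, 1.0, 3.0, 1.0, 3.0, 1.4, 1.0, 1.4]},
                {"text": "Serves 4",
                 "boundingBox": [1.0, 1.6, 2.0, 1.6, 2.0, 1.9, 1.0, 1.9]},
            ]}
        ]
    },
}

for page, text, rect in extract_lines(sample):
    print(page, text, rect)
```

With a real resource you would call something like `extract_lines(analyze_layout(endpoint, key, page_url))` instead of using the sample. Because readingOrder=natural is set in the URL, the list order should already follow the reading order, and the rectangles give you positions for reassembling the page.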