azure-cognitive-services azure-form-recognizer

Extract text from required pages in PDF file

I am trying to use Form Recognizer (an Azure Cognitive Service) to extract text from a PDF file. I am using a custom model: I train the service to build my model and then use it to extract data.

My PDFs usually have more than one page, but I am only interested in extracting text from the first page. The remaining pages are not important.

Is there a way to train the service to extract text only from selected pages, for example by specifying page numbers?

Regards,

Madhu

Solution

The Form Recognizer API does not currently support page ranges for documents when training models. You may have to pre-process the documents with third-party tools or APIs so that you send only the pages you want the model to be trained on.
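As one possible way to do that pre-processing, here is a minimal sketch that copies only the wanted pages into a new PDF before uploading it. It assumes the open-source pypdf package is installed (any PDF splitting tool would do); the function names and the file names in the usage line are made up for illustration and are not part of Form Recognizer.

```python
from typing import Iterable, List


def zero_based_indices(page_numbers: Iterable[int], total_pages: int) -> List[int]:
    """Convert 1-based page numbers to 0-based indices, rejecting out-of-range pages."""
    indices = []
    for number in page_numbers:
        if not 1 <= number <= total_pages:
            raise ValueError(f"page {number} is outside 1..{total_pages}")
        indices.append(number - 1)
    return indices


def keep_pages(src_path: str, dst_path: str, page_numbers: Iterable[int] = (1,)) -> None:
    """Write a copy of src_path that contains only the requested pages.

    Hypothetical helper: by default it keeps just the first page.
    """
    # Assumption: pypdf is installed. It is imported lazily so the
    # page-number helper above can be used without it.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    writer = PdfWriter()
    for index in zero_based_indices(page_numbers, len(reader.pages)):
        writer.add_page(reader.pages[index])
    with open(dst_path, "wb") as out_file:
        writer.write(out_file)
```

For example, keep_pages("invoice.pdf", "invoice_page1.pdf") would write a one-page file that can then be used for training in place of the full document. The same trimming would presumably be applied to the documents sent for extraction, so that they match what the model was trained on.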