pdf, itextpdf

itextpdf-5.4.3 throws com.itextpdf.text.pdf.parser.InlineImageUtils$InlineImageParseException: EI not found after end of image data


I am getting the "EI not found" error for one specific PDF, available at https://bfs.ever-team.com/files/6fce4cef9769e40d1994e684a881d4bf/facture3_1.pdf.

I am using the itextpdf-5.4.3 jar, and my code is below:

com.itextpdf.awt.geom.Rectangle rec = new com.itextpdf.awt.geom.Rectangle(307, 728, 742, 400);
RenderFilter filter = new RegionTextRenderFilter(rec);

TextExtractionStrategy strategy = new FilteredTextRenderListener(new LocationTextExtractionStrategy(), filter);
String currentText = PdfTextExtractor.getTextFromPage(reader, i, strategy);

The method getTextFromPage throws the exception. I checked other threads, which said this error should be fixed in the latest jar, but it still does not work for my file (facture3_1.pdf). Can anyone advise, please?


Solution

  • A crosspost of this question has been answered on the iText mailing list. To close the question here as well, that answer is copied below:

    The issue can be reproduced with iText 5.4.3 but not with the current development snapshot. The OP should therefore update to a newer iText version.

    InlineImageParseException: EI not found after end of image data
    

    EI denotes the end of an inline image. The handling of inline images is tricky and not strictly well-defined. iText recently improved its handling of inline images to correctly parse more PDFs with such inline images.
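
To illustrate why finding EI is tricky, here is a minimal sketch in plain Java (no iText dependency). It is not iText's actual parsing code. The class InlineImageEndSketch and the method findNaiveEi are hypothetical names made up for this example, and the content strings are simplified stand-ins for a real content stream. An inline image has the form BI (dictionary) ID (image data) EI, and the image data is raw bytes, so the two bytes "EI" can also occur inside the data itself. A naive scan for the first white-space-delimited "EI" can therefore stop too early, or find nothing at all.

```java
import java.nio.charset.StandardCharsets;

public class InlineImageEndSketch {

    // PDF white-space characters: NUL, TAB, LF, FF, CR, SPACE.
    static boolean isPdfWhitespace(int b) {
        return b == 0 || b == 9 || b == 10 || b == 12 || b == 13 || b == 32;
    }

    /**
     * Naive heuristic (an assumption for illustration, not iText's algorithm):
     * returns the index of the first "EI" at or after 'from' that is preceded
     * by white space and followed by white space or end of input, or -1.
     */
    static int findNaiveEi(byte[] content, int from) {
        for (int i = Math.max(from, 1); i + 1 < content.length; i++) {
            boolean isEi = content[i] == 'E' && content[i + 1] == 'I';
            boolean wsBefore = isPdfWhitespace(content[i - 1]);
            boolean wsAfter = i + 2 == content.length || isPdfWhitespace(content[i + 2]);
            if (isEi && wsBefore && wsAfter) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // The image data "x EI y" happens to contain a white-space-delimited "EI".
        String ambiguous = "BI /W 4 /H 1 ID x EI y\nEI\nQ";
        // No EI operator at all after the image data.
        String missing = "BI /W 1 /H 1 ID abc";

        for (String s : new String[] {ambiguous, missing}) {
            byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
            int dataStart = s.indexOf("ID") + 2; // start scanning after the ID operator
            int found = findNaiveEi(bytes, dataStart);
            System.out.println("naive EI index: " + found
                    + ", last EI in input: " + s.lastIndexOf("EI"));
        }
    }
}
```

In the first input the naive scan stops at the "EI" inside the image data rather than at the real end marker, and in the second it finds no end marker. A robust parser needs extra checks, for example computing the expected data length from the image dictionary or validating what follows a candidate EI, which is the kind of handling a newer iText version improves.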