
iText (itextpdf) stops transforming a PDF correctly


I have the following issue with iText (itextpdf).

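import java.io.FileOutputStream;
import java.io.IOException;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfStamper;

private void generatePdf() throws Exception {
    FileOutputStream fos = null;
    try {
        // Read the template and write it unchanged to a new file.
        PdfReader reader = new PdfReader("template.pdf");
        fos = new FileOutputStream("test.pdf");
        PdfStamper stamper = new PdfStamper(reader, fos);

        // No fields are filled yet; closing the stamper writes the output.
        stamper.close();
    } catch (Exception e) {
        throw e;
    } finally {
        if (fos != null) {
            try {
                fos.close();
            } catch (IOException e) {
                throw new Exception(e);
            }
        }
    }
}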

This method is supposed to read a template and save it as a new PDF. But when I open the resulting PDF, I see only blank pages (4, the same number as the template has). Interestingly, this happens when the method is invoked in the context of a web app on a JBoss server; when I invoke the same method from a simple Java application (a class with a main() method), it works fine. I should add that the template has editable fields that are to be filled in later, but nothing edits them yet. Can anybody suggest what might be wrong here?

Best Regards, Sergey


Solution

    The cause

    In the comments it turned out that the OP builds his web application with Maven, that the template.pdf file is supplied as a Maven resource, and that filtering (i.e. text variable replacement) of the resources is activated.

    Unfortunately, though, filtering resources implies that the resource files are treated as text files and are eventually stored using UTF-8 character encoding.

    This essentially destroyed all compressed stream contents (especially page contents and font programs) and some meta information strings, and also rendered the cross references incorrect (writing as UTF-8 introduced additional bytes which shifted offsets).

    iText could still read the PDF after creating a cross reference table for the mangled file because outside those streams and strings the structure was still correct. The result of writing the read mangled PDF, therefore, contained the right number of pages and some form fields, but the page contents were lost.
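
    The byte-level effect described above can be reproduced in a few lines of plain Java. This is only a sketch of the mechanism, not of Maven's actual filtering code: the class name FilteringDamageDemo, the helper roundTrip and the sample bytes are made up for illustration, and which encoding the build reads the file with is an assumption (both ISO-8859-1 and UTF-8 are shown).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class FilteringDamageDemo {

    // Mimics a text-oriented copy: decode the bytes to a String, then encode it again.
    static byte[] roundTrip(byte[] data, Charset readAs, Charset writeAs) {
        return new String(data, readAs).getBytes(writeAs);
    }

    public static void main(String[] args) {
        // Plain ASCII, like the PDF header and most structural keywords.
        byte[] ascii = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        // Arbitrary binary bytes, as they may occur inside a compressed stream.
        byte[] binary = {(byte) 0x78, (byte) 0x9C, (byte) 0xFF, (byte) 0x0A};

        byte[] asciiCopy = roundTrip(ascii, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        // Assumption: file read as ISO-8859-1, written as UTF-8.
        byte[] latin1Copy = roundTrip(binary, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        // Assumption: file read and written as UTF-8; invalid bytes become U+FFFD.
        byte[] utf8Copy = roundTrip(binary, StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        System.out.println("ASCII unchanged:  " + Arrays.equals(ascii, asciiCopy)); // true
        System.out.println("binary length:    " + binary.length);                  // 4
        System.out.println("via ISO-8859-1:   " + latin1Copy.length);              // 6
        System.out.println("via UTF-8:        " + utf8Copy.length);                // 8
    }
}
```

    The ASCII part survives untouched, which matches the observation that the overall structure stayed readable, while every binary byte above 0x7F grows to two or three bytes, which both corrupts the stream data and shifts all following offsets.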

    The cure

    The solution is to not filter PDF resources. This can be done, for example, as explained on the Apache Maven site:

    By default, files with extensions (jpg, jpeg, gif, bmp and png) won't be filtered anymore.

    Users can add some extra file extensions to not apply filtering with the following configuration :

    <project>
      ...
      <build>
        <plugins>
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-resources-plugin</artifactId>
            <version>2.7</version>
            <configuration>
              ...
              <nonFilteredFileExtensions>
                <nonFilteredFileExtension>pdf</nonFilteredFileExtension>
                <nonFilteredFileExtension>swf</nonFilteredFileExtension>
              </nonFilteredFileExtensions>
              ...
            </configuration>
          </plugin>
        </plugins>
        ...
      </build>
      ...
    </project>
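
    To confirm that the build now leaves the template alone, one can compare the source file with the copy the build produced. The following is a minimal sketch; the class name ResourceCopyCheck and both default paths are hypothetical and assume a standard Maven layout, so adjust them to the actual project.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

public class ResourceCopyCheck {

    // True if both files contain exactly the same bytes.
    static boolean sameBytes(Path original, Path packaged) throws IOException {
        return Arrays.equals(Files.readAllBytes(original), Files.readAllBytes(packaged));
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default paths; pass the real ones as arguments if they differ.
        Path original = Paths.get(args.length > 0 ? args[0] : "src/main/resources/template.pdf");
        Path packaged = Paths.get(args.length > 1 ? args[1] : "target/classes/template.pdf");

        System.out.println(sameBytes(original, packaged)
                ? "Template copied unchanged"
                : "Template was altered during the build");
    }
}
```

    If the two files differ after a build, filtering (or some other text-oriented processing) is still being applied to the PDF.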