pdf itext itextpdf

How to insert bytes into a PDF then separate it without affecting the file?


I'm trying to sign some PDF files. It should have been easy, but my country created its own set of standards. Following these standards, I have to upload my files to a third-party API and get a signature file (SF) in return. To verify a file, I upload both the file and its SF to another third-party API.

I want to issue these documents to my users. Shipping a separate SF alongside each file is awkward, so I first tried to use the SF to insert a PDF signature into my PDF file. But the SFs are generated using algorithms invented by the government, and those algorithms are not supported by the PDF standard.

Now my idea is to insert the SF somewhere in my PDF. If a user wants to verify the document, they upload the file to me, I separate the PDF and the SF, and then I call the API to verify them.

Now the problems are:

  1. Where can I insert bytes while keeping the PDF readable?
  2. How can I make sure that the PDF after separation is exactly the same as the original one?

I'm using iText. Thanks for reading and for any help.


Solution

  • You can use iText to add attachments to a PDF file. There are two flavors of attachment.

    • Attachment annotations: there's a visual object on the page (e.g. a paper clip) and when the user clicks that visual object, the attachment opens. See File attachment annotation.
    • Embedded files: these are document-level attachments. They aren't visible anywhere on a page, but most PDF viewers have an "attachment panel" that can be opened and end users will see the attachments there. See embedded files.

    Choose whichever type of attachment you like most, and then you can use PdfStamper to add such an attachment to your PDF. See for instance How to load a PDF from a stream and add a file attachment? (for an example in C#) or How to delete attachments in PDF using iText? (for an example in Java that adds, and then deletes the attachment).

    Isn't this question a duplicate of the questions I mention? No, certainly not, because the examples I wrote in answer to those questions change the bytes of the original PDF document. When those bytes are changed, the exotic signature imposed by your country (a country that made a very bad decision by not using real digital signatures as described in PAdES) will break. That is the real question: how do you add the attachment whilst preserving the original bytes?

    This is explained in my answer to the question Why do PDFs change when processing them?

    In that answer, I explain how to manipulate a PDF in append mode:

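    PdfReader reader = new PdfReader(src);
    // '\0' keeps the original PDF version; true switches on append mode
    PdfStamper stamper = new PdfStamper(reader,
        new FileOutputStream(dest), '\0', true);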
    

    A PDF file looks like this:

    %PDF-1.7
    // Original PDF syntax
    %%EOF
    

    When we use PdfStamper, we typically end up with a file like this:

    %PDF-1.7
    // Altered PDF syntax
    %%EOF
    

    When we use PdfStamper in append mode, we end up with a file like this:

    %PDF-1.7
    // Original PDF syntax
    %%EOF
    // Some new PDF syntax
    %%EOF
    

    In other words: iText doesn't touch the original syntax; all the original bytes are preserved. To obtain the original bytes, you need to remove all the bytes that follow the original %%EOF.
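
    The separation step can be sketched in plain Java, without iText. This is only a minimal sketch and not part of the original answer: the class name OriginalRevision and the method stripLastRevision are hypothetical. It assumes that exactly one revision was appended, that the marker %%EOF doesn't occur inside a binary stream, and that any end-of-line bytes directly after the original %%EOF belonged to the original file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper: cuts the most recently appended revision off a PDF.
final class OriginalRevision {
    private static final byte[] EOF_MARKER =
            "%%EOF".getBytes(StandardCharsets.US_ASCII);

    // Index of the last occurrence of marker starting at or before 'from', or -1.
    static int lastIndexOf(byte[] data, byte[] marker, int from) {
        for (int i = Math.min(from, data.length - marker.length); i >= 0; i--) {
            boolean match = true;
            for (int j = 0; j < marker.length; j++) {
                if (data[i + j] != marker[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return i;
            }
        }
        return -1;
    }

    // Returns the bytes up to and including the second-to-last %%EOF marker.
    static byte[] stripLastRevision(byte[] stamped) {
        int last = lastIndexOf(stamped, EOF_MARKER, stamped.length);
        if (last < 0) {
            throw new IllegalArgumentException("no %%EOF marker found");
        }
        int previous = lastIndexOf(stamped, EOF_MARKER, last - 1);
        if (previous < 0) {
            throw new IllegalArgumentException("no appended revision found");
        }
        int end = previous + EOF_MARKER.length;
        // Assumption: EOL bytes right after the marker were part of the original.
        while (end < stamped.length
                && (stamped[end] == '\r' || stamped[end] == '\n')) {
            end++;
        }
        return Arrays.copyOf(stamped, end);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: OriginalRevision <stamped.pdf> <original.pdf>");
            return;
        }
        byte[] stamped = Files.readAllBytes(Paths.get(args[0]));
        Files.write(Paths.get(args[1]), stripLastRevision(stamped));
    }
}
```

    The end-of-line assumption can't be checked from the marker alone, so it is probably safer to record the length or a digest of the original PDF when you issue it, and to compare the separated bytes against that record before calling the verification API.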

    Update:

    @mkl added a comment about creating a portfolio, also known as a portable collection. A portable collection is a PDF that acts like a ZIP file.

    You could use the original PDF as the cover page and the signature file as an embedded file. The advantage of a portable collection over my earlier suggestion is that the end user doesn't need to throw away bytes from the PDF. They can just extract the original PDF from the portable collection using a PDF viewer.