Best way to create/fill-in printed forms and pdfs?


We have a C# application that must print complex forms. Things like multi-page government compliance forms that must be in a specific format. We can get PDF copies of these forms and create form fields, but aren't sure how to fill in this data and create a PDF that can be auto-printed and sent to our clients (they need paper copies).

Also, some of the forms are dynamic, in that certain pages must be repeated (for example, for an employee equal opportunity audit report we might need to include 5 copies of a page in the form if it holds 50 employees but the client has 250).

In general, what's the best way to populate and print these forms? Note that our application is C#-based, but any solution in any language/app is welcome (we're open to buying software or integrating with other frameworks if needed).

For example - what would something like TurboTax use to print out the hundreds of tax forms that it handles?


Solution

  • There are several options here.

    1. FDF (Forms Data Format). The spec document for it is terrible: it only covers a small (infrequently used, complicated) part of the FDF format. FDF files are fairly trivial to generate. They contain a pile of field/value pairs (and can hold list options and other fancier stuff you won't need) plus a reference to the PDF form they belong to. Opening the FDF fills the PDF (via a file association with Acrobat/Reader).

    Here's a sample (with extra whitespace to make it more readable):

    %FDF-1.2
    1 0 obj
    << /FDF
      << /Fields  [
        << /V (Communications Co.)/T (Address1)>>
        << /V (29 Communications Road)/T (Address2)>>
        << /V (Busyville)/T (City)>>
        << /V (USA)/T (Country)>>
        << /V (24 January 2000)/T (Date)>>
        << /V (Suzie Smith)/T (Name)>>
        << /V (\(807\) 221-9999)/T (PhoneNumber)>>
        << /V (777-11-8888)/T (SSN)>>
        << /V (NJ)/T (State)>>
      ]
      /F (TestForm.pdf)
      >>
    >>
    endobj
    trailer
    <<
      /Root 1 0 R
    >>
    %%EOF
    

    "/V" indicates a field value, "/T" is a field's title. "/F" is the path to the form to be filled.

    There are a number of mail-merge-esque products floating around that can take in an FDF and a PDF and produce a filled PDF form. iText (and several other libraries) can do this programmatically; other apps offer command-line interfaces.

    Any page that might need to be repeated should be its own form in this environment. Merging forms can be Quite Hard. There are a couple of approaches, the easiest being to "flatten" the fields so they are just page contents (line art & text)... then you're not really merging PDF forms any more.

    Of course if you can control the order in which things are printed, you needn't merge the forms at all. You could just open/print them in the correct order.
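    To make the repeated-page case concrete, here is a minimal sketch of splitting the data into one set of field values per copy of the repeating page, in print order. It is written in Python for brevity; the logic is plain list slicing and ports directly to C#. The field names (`Employee1`..`Employee50`, `PageNumber`, `PageCount`) and the 50-rows-per-page figure are assumptions for illustration, so substitute whatever your actual form defines.

```python
def paginate(rows, rows_per_page):
    """Split `rows` into consecutive pages of at most `rows_per_page` items."""
    if rows_per_page < 1:
        raise ValueError("rows_per_page must be at least 1")
    return [rows[i:i + rows_per_page] for i in range(0, len(rows), rows_per_page)]


def page_fields(page_rows, page_number, page_count):
    """Map one page of rows onto field names.

    The names Employee1..EmployeeN, PageNumber and PageCount are hypothetical;
    a real form dictates its own field names.
    """
    fields = {f"Employee{slot}": name for slot, name in enumerate(page_rows, start=1)}
    fields["PageNumber"] = str(page_number)
    fields["PageCount"] = str(page_count)
    return fields


# 250 placeholder employees on a page that holds 50 gives 5 copies of the page.
employees = [f"Employee #{n}" for n in range(1, 251)]
pages = paginate(employees, rows_per_page=50)
jobs = [page_fields(rows, number, len(pages)) for number, rows in enumerate(pages, start=1)]

if __name__ == "__main__":
    print(len(jobs), "copies of the repeating page to fill and print")
```

    Each entry in `jobs` is then one fill-and-print operation against the single-page form, issued in order, so nothing has to be merged.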

    As I recall, Acrobat Pro's batch commands can import FDF data and print. All you'd need to do would be to generate the appropriate FDF files, which is mostly trivial string building.
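    As a rough illustration of that string building, the sketch below emits FDF text in the same layout as the sample above. It is in Python for brevity, but it is nothing more than string concatenation and translates line for line to C#. The function names are made up for this example, and it assumes plain ASCII values; text outside that range needs proper PDF string encoding, which this sketch does not attempt.

```python
from __future__ import annotations


def escape_pdf_string(text: str) -> str:
    """Escape the characters that are special inside a PDF literal string."""
    # Backslash goes first so the escapes added for parentheses are not doubled.
    return (
        text.replace("\\", "\\\\")
            .replace("(", "\\(")
            .replace(")", "\\)")
    )


def build_fdf(fields: dict[str, str], pdf_path: str) -> str:
    """Return FDF text that fills `fields` (name -> value) into `pdf_path`."""
    entries = "\n".join(
        f"    << /V ({escape_pdf_string(value)})/T ({escape_pdf_string(name)})>>"
        for name, value in fields.items()
    )
    return (
        "%FDF-1.2\n"
        "1 0 obj\n"
        "<< /FDF\n"
        "  << /Fields [\n"
        f"{entries}\n"
        "  ]\n"
        f"  /F ({escape_pdf_string(pdf_path)})\n"
        "  >>\n"
        ">>\n"
        "endobj\n"
        "trailer\n"
        "<<\n"
        "  /Root 1 0 R\n"
        ">>\n"
        "%%EOF\n"
    )


if __name__ == "__main__":
    sample = {"Name": "Suzie Smith", "PhoneNumber": "(807) 221-9999"}
    print(build_fdf(sample, "TestForm.pdf"))
```

    The escaping step matters: an unescaped, unbalanced parenthesis in a value (a phone number typed as `807) 221-9999`, say) would end the string early and corrupt the file.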

    Using FDF presumes you have the PDF forms already made, just waiting to be filled out. If that's not the case...

    2. Generate your PDF forms programmatically. I do this with iText (the Java basis of iTextSharp), though there are quite a few libraries available in various languages. iText[Sharp] is licensed under the AGPL (or commercially). With AGPL, anyone with access to your OUTPUT must have access to the source of your application. AGPL is just as "viral" as the regular GPL. Older versions were available under the MPL.

    Given that this is strictly internal and that you'll be printing the PDFs, the licensing isn't much of an issue.

    It would be considerably more efficient to generate your form templates once then fill them in... either directly or via FDF.