Tags: c#, pdf, annotations, appearance, pdftron

How can I save the memory stream of a pdf annotation from PDFTron?


I have read some articles about the PDF format, and I want to read annotations from a PDF document and save their appearance and data into a database.

From what I found, an annotation is fully represented by its "stream", which I would expect to be a byte array that every PDF viewer converts into the correct appearance. But how can I extract this information with an SDK like PDFTron?

Or should I design a model for each annotation type and manually extract the most important values of an annotation?

Greetings and Thanks for Answers!


Solution

  • Great question.

    "that a annotation is fully represented by its 'stream'"

    The appearance stream, yes. But annotations also carry a lot of metadata, such as creation and last-modified dates, author, location and size, flags, and properties that define the appearance.

    Fortunately, the PDF specification (ISO 32000) describes a way to exchange annotations outside of a PDF file: the FDF (Forms Data Format) format, which is essentially a PDF that contains only annotation (or form field) information.

    The FDF data will contain all the annotation information including the appearance stream.

    With PDFNet, you can export an annotation using FDFExtract:

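    // Requires: using System.Collections; plus the PDFNet namespaces (pdftron.PDF, pdftron.FDF).
    // Collect the annotation(s) to export; annot is an existing pdftron.PDF.Annot.
    ArrayList annotations = new ArrayList();
    annotations.Add(annot);
    // Extract only the listed annotations into a standalone FDF document.
    FDFDoc fdfdoc = pdfdoc.FDFExtract(annotations);
    // Write the FDF to a temporary file, then read it back as bytes for storage.
    fdfdoc.Save(tempFileLocation);
    byte[] data = System.IO.File.ReadAllBytes(tempFileLocation);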
    

    Note: currently you have to write the FDF to disk first, but an FDFDoc.Save() API that returns a byte[] directly could be added for convenience.
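
    Until such an API exists, the temp-file round trip can be wrapped in a small helper so the file is always cleaned up. This is only a sketch: TempFileBytes and Capture are hypothetical names, not part of PDFNet, and it assumes nothing beyond a save method that takes a file path, such as the fdfdoc.Save(path) call used here.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of PDFNet): runs a "save to path" callback
// against a temporary file and returns that file's contents as a byte array.
public static class TempFileBytes
{
    public static byte[] Capture(Action<string> saveToPath)
    {
        // GetTempFileName creates an empty, uniquely named file on disk.
        string path = Path.GetTempFileName();
        try
        {
            saveToPath(path);                 // e.g. fdfdoc.Save(path)
            return File.ReadAllBytes(path);   // bytes ready to store in a database
        }
        finally
        {
            File.Delete(path);                // remove the temp file even if saving throws
        }
    }
}
```

    With that in place, the last two lines of the export code would become something like byte[] data = TempFileBytes.Capture(path => fdfdoc.Save(path));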

    To import the annotation again, load the stored bytes and merge them into the document:

    // Rebuild the FDF document from the stored bytes and merge it into the target PDF.
    FDFDoc fdfdoc = new FDFDoc(data, data.Length);
    pdfdoc.FDFMerge(fdfdoc);
    

    See this sample for more examples of FDF usage: https://www.pdftron.com/documentation/samples/cs/FDFTest

    A bonus of using FDF to store your annotations is that there is no vendor lock-in, because the format is fully defined in the PDF specification.