
PDFTron: convert annotations to SVG


I need to scan a PDF document, extract some metadata from its annotations, get an SVG representation of each annotation, and save it to a database. I am using PDFTron and .NET for PDF processing.

During my research, I found two ways to do it. The first way:

  1. Extract the FDF data from the initial document. Let's call it in_pdf.
  2. Create an empty PDF file and merge the FDF data into it, which gives a PDF containing only the annotations. Let's call it temp_pdf.
  3. Convert temp_pdf to SVG.
  4. Open in_pdf and try to find the corresponding SVG tag for every annotation. But I do not know how to find the corresponding tag.

The second way:

  1. Extract the FDF data from the initial document for every annotation, that is, make a separate FDF for each annotation.
  2. Merge each FDF into an empty temp_pdf, that is, make a separate PDF for each annotation.
  3. Convert each temp_pdf to SVG. This gives me a mapping between each annotation and its SVG string, but it creates many temporary documents.

All of this would be much simpler if I had a tool to convert each annotation to SVG directly, rather than the whole document. Is there a way to do that with PDFTron?


Solution

  • You can export the appearance of annotations to a PDF page, and then you can convert that page to SVG.

    This forum post shows how to render a specific annotation to an image: https://groups.google.com/d/msg/pdfnet-sdk/s8eeLmyNuGc/b_0gA02He3IJ

    To adapt that code to your use case, you can generate the SVG as follows.

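    // Create a scratch page in the same document and attach the annotation to it.
    Page temp_page = doc.PageCreate();
    temp_page.AnnotPushBack(annot);
    // Move the annotation's appearance into the page content stream and remove the annotation.
    annot.Flatten(temp_page);
    // Shrink the page to the drawn content so the SVG contains only the annotation.
    temp_page.SetMediaBox(temp_page.GetVisibleContentBox());
    Convert.ToSvg(temp_page, "out_path", svg_options);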
    

    From here you can use standard XML tools to merge this SVG content into your target SVG file.

    To position and size the annotation, call

    annot.GetRect()
    

    The x1, y1 values give you the bottom-left corner, and x2, y2 give you the top-right corner.

    The generated SVG output has the same scale as the PDF, so you can use the values as is.
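
The merge step can be sketched with LINQ to XML from the .NET standard library. This is only an illustration, not part of the PDFTron API: the class `SvgAnnotationMerger`, the method `Place`, and the parameters `pageHeight` and `annotId` are hypothetical names. It assumes the per-annotation SVG has its origin at the top-left of the cropped page, and that the target SVG uses a top-left origin while the values from annot.GetRect() are in PDF space with a bottom-left origin, so the Y value is flipped. If your target already uses PDF-style coordinates, drop the flip.

```csharp
using System.Globalization;
using System.Xml.Linq;

public static class SvgAnnotationMerger
{
    private static readonly XNamespace Svg = "http://www.w3.org/2000/svg";

    // Copies the children of annotSvg into a <g> element translated to the
    // annotation's position, and appends that group to targetSvg.
    // x1 and y2 are the left and top edges taken from annot.GetRect();
    // pageHeight is the height of the original PDF page in points.
    public static XElement Place(XDocument targetSvg, XDocument annotSvg,
                                 double x1, double y2, double pageHeight,
                                 string annotId)
    {
        // ASSUMPTION: the target SVG has a top-left origin, so the PDF
        // top edge (y2, measured from the bottom) is flipped.
        double tx = x1;
        double ty = pageHeight - y2;

        string transform = string.Format(
            CultureInfo.InvariantCulture, "translate({0},{1})", tx, ty);

        // Elements that already have a parent are cloned when added to a
        // new parent, so annotSvg itself is left unchanged.
        var group = new XElement(Svg + "g",
            new XAttribute("id", annotId),
            new XAttribute("transform", transform),
            annotSvg.Root.Elements());

        targetSvg.Root.Add(group);
        return group;
    }
}
```

For example, calling `SvgAnnotationMerger.Place(target, annotSvg, 100, 700, 792, "annot-1")` adds a group with `transform="translate(100,92)"` to the target. Giving each group an id derived from the annotation keeps the mapping between annotations and their SVG fragments without creating one temporary document per annotation.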