c# pdfclown

PDFClown Find and replace page


Using PDF Clown, what is the best practice for finding a page in an existing PDF document and replacing it with a page from another PDF document?

I have the bookmark and page label of both pages.


Solution

  • A simple example for replacing pages can be derived from the PageManager CLI examples:

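    using org.pdfclown.documents;                                  // Document
    using org.pdfclown.documents.interaction.navigation.document;  // Bookmarks
    using org.pdfclown.files;                                      // SerializationModeEnum
    using org.pdfclown.tools;                                      // PageManager

    string inputA = @"A.pdf";
    string inputB = @"B.pdf";
    string output = @"A-withPage1FromB-simple.pdf";
    
    // File is fully qualified to avoid a clash with System.IO.File
    org.pdfclown.files.File fileA = new org.pdfclown.files.File(inputA);
    org.pdfclown.files.File fileB = new org.pdfclown.files.File(inputB);
    
    // replace page 0 in fileA with page 0 from fileB
    Document mainDocument = fileA.Document;
    // not used in this simple variant; PageManager leaves the bookmarks untouched
    Bookmarks bookmarks = mainDocument.Bookmarks;
    PageManager manager = new PageManager(mainDocument);
    // remove the original first page, then insert the first page of fileB in its place
    manager.Remove(0, 1);
    manager.Add(0, fileB.Document.Pages.GetSlice(0, 1));
    
    fileA.Save(output, SerializationModeEnum.Standard);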
    

    This does replace the first page in A.pdf with the first page in B.pdf and saves the result as A-withPage1FromB-simple.pdf.

    Unfortunately, the PageManager does not update bookmarks. In the result of the code above, there is still a bookmark that used to point to the original first page; since that page is no longer there, the bookmark now points nowhere. The bookmark pointing to the first page in fileB is ignored completely.

    Other document-level, page-related properties are not transferred either, e.g. the page label. In the case of page labels, though, the original label for the first page remains associated with the first page after the replacement. This is because page labels use a different kind of reference (by page number, not by object).
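
The difference between the two kinds of reference can be illustrated with a small toy model. This is only a sketch and deliberately does not use the PDF Clown API: ToyPage, ToyDocument and ToyDemo are hypothetical names invented here, and the dictionaries merely stand in for a document's bookmarks (which reference a page object) and page labels (which reference a page number).

```csharp
using System;
using System.Collections.Generic;

// Toy model only: none of these types belong to PDF Clown.
public class ToyPage
{
    public string Name;
    public ToyPage(string name) { Name = name; }
}

public class ToyDocument
{
    public List<ToyPage> Pages = new List<ToyPage>();
    // Bookmark title -> page object (reference by object).
    public Dictionary<string, ToyPage> Bookmarks = new Dictionary<string, ToyPage>();
    // Page index -> label (reference by page number).
    public Dictionary<int, string> Labels = new Dictionary<int, string>();

    // Returns the target page name, or null if the target page
    // is no longer part of the document (dangling bookmark).
    public string ResolveBookmark(string title)
    {
        ToyPage target = Bookmarks[title];
        return Pages.Contains(target) ? target.Name : null;
    }

    // Returns the label stored for a page index, or null if there is none.
    public string LabelOf(int index)
    {
        string label;
        return Labels.TryGetValue(index, out label) ? label : null;
    }
}

public static class ToyDemo
{
    public static ToyDocument BuildAndReplace()
    {
        ToyDocument doc = new ToyDocument();
        ToyPage oldFirst = new ToyPage("A-page-1");
        doc.Pages.Add(oldFirst);
        doc.Pages.Add(new ToyPage("A-page-2"));
        doc.Bookmarks["Intro"] = oldFirst; // points at the page object
        doc.Labels[0] = "i";               // points at page number 0

        // Replace page 0 as in the simple example: remove, then insert.
        doc.Pages.RemoveAt(0);
        doc.Pages.Insert(0, new ToyPage("B-page-1"));
        return doc;
    }

    public static void Main()
    {
        ToyDocument doc = BuildAndReplace();
        Console.WriteLine(doc.ResolveBookmark("Intro") ?? "(dangling)");
        Console.WriteLine(doc.LabelOf(0));
    }
}
```

Run as a console program, this prints `(dangling)` followed by `i`: the bookmark still holds the removed page object and therefore resolves to nothing, while the label is keyed by position and simply attaches to whatever page now sits at index 0.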