
How to reference bookmark as parent from PdfReader object using pypdf


I am attempting to merge a PDF document that has bookmarks with a series of PDFs that have none. I want to create bookmarks that are children of the last bookmark in the original PDF, but when I run the following code:

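from PyPDF2 import PdfFileMerger, PdfFileReader  # legacy PyPDF2 API

def mergePDFfiles(pdffile):
    merger = PdfFileMerger()
    doc = PdfFileReader(open(pdffile, 'rb'))
    merger.append(doc)
    doc_length = doc.getNumPages()
    outline = doc.getOutlines()
    # Last top-level entry of the reader's outline, intended as the parent
    parent = outline[-1]
    # filename (defined elsewhere) is one of the PDFs without bookmarks
    merger.append(PdfFileReader(open(filename, 'rb')), import_bookmarks=False)
    sub = merger.addBookmark("SUBBOOKMARK", doc_length, parent)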

I get the error:

TypeError: 'NoneType' object has no attribute '__getitem__'

outline[-1] returns a Destination object that looks very similar to a bookmark object, but the two seem to be different types. Is there a way to convert the Destination object into a bookmark object?


Solution

  • OK. I did some digging in the source code and found an undocumented PdfFileMerger method called findBookmark, which locates the merger's own bookmark with a matching title. Passing its result to addBookmark as the parent seems to work.

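    from PyPDF2 import PdfFileMerger, PdfFileReader  # legacy PyPDF2 API

    def mergePDFfiles(pdffile):
        merger = PdfFileMerger()
        doc = PdfFileReader(open(pdffile, 'rb'))
        merger.append(doc)
        doc_length = doc.getNumPages()
        outline = doc.getOutlines()
        # findBookmark is a method of the merger, so call it on the merger
        # and look up the last top-level bookmark by its title
        parent = merger.findBookmark(outline[-1].title)
        # filename (defined elsewhere) is one of the PDFs without bookmarks
        merger.append(PdfFileReader(open(filename, 'rb')), import_bookmarks=False)
        sub = merger.addBookmark("SUBBOOKMARK", doc_length, parent)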
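
To illustrate what a title-based lookup like this does, here is a minimal pure-Python sketch. It is a simplified model, not PyPDF2's actual implementation: the function name find_bookmark_path and the nested-list outline layout (a list following an entry holds that entry's children) are assumptions made for illustration. The sketch returns the position of the matching entry as a list of indices, or None when nothing matches. A failed lookup of that kind would plausibly explain the original error, since indexing None raises a TypeError.

```python
def find_bookmark_path(title, outline):
    """Return the index path to the first entry titled `title`, or None.

    `outline` is a hypothetical nested list: each entry is a dict with a
    "/Title" key, and a list following an entry holds that entry's children.
    """
    for i, entry in enumerate(outline):
        if isinstance(entry, list):
            # Search the children; prefix their path with this index.
            sub_path = find_bookmark_path(title, entry)
            if sub_path is not None:
                return [i] + sub_path
        elif entry["/Title"] == title:
            return [i]
    return None


# Hypothetical outline: "Chapter 2" has one child, "Section 2.1".
outline = [
    {"/Title": "Chapter 1"},
    {"/Title": "Chapter 2"},
    [{"/Title": "Section 2.1"}],
]

print(find_bookmark_path("Chapter 2", outline))    # [1]
print(find_bookmark_path("Section 2.1", outline))  # [2, 0]
print(find_bookmark_path("Missing", outline))      # None
```

Matching by title returns the first hit, so if two bookmarks share a title, the lookup may not pick the one you intended.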