Applying a UDF in a for loop - Python


Example of PDF: "Smith#00$Consolidated_Performance.pdf"

The goal is to add a bookmark to page 1 of each PDF based on the filename.

(Bookmark name in example would be "Consolidated Performance")

import os
from openpyxl import load_workbook
from PyPDF2 import PdfFileMerger

cdir = "Directory of PDF" # Current directory
pdfcdir = [filename for filename in os.listdir(cdir) if filename.endswith(".pdf")]

def addbookmark(f):
    output = PdfFileMerger()
    name = os.path.splitext(os.path.basename(f))[0] # Split filename from .pdf extension
    dp = name.index("$") + 1 # Find position of $ sign
    bookmarkname = name[dp:].replace("_", " ") # replace underscores with spaces
    output.addBookmark(bookmarkname, 0, parent=None) # Add bookmark
    output.append(open(f, 'rb'))
    output.write(open(f, 'wb'))

for f in pdfcdir:
    addbookmark(f)

The UDF works fine when applied to individual PDFs, but it won't add the bookmarks when put into the loop at the bottom of the code. Any ideas on how to make the UDF loop through all PDFs within pdfcdir?


Solution

  • I'm pretty sure that the issue you're having has nothing to do with the loop. Rather, you're passing just the filenames and not including the directory path. os.listdir(cdir) returns bare names such as "Smith#00$Consolidated_Performance.pdf", so open(f, 'rb') looks for them relative to the current working directory (the directory you launched the script from, which is not necessarily the one the script lives in) rather than in the directory you read the filenames from.

    So, join the directory name with each file name when calling your function.

    for f in pdfcdir:
        addbookmark(os.path.join(cdir, f))
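
To see the difference between a bare filename and a joined path without touching any real PDFs, here is a minimal sketch using only the standard library. The helper names `list_pdf_paths` and `bookmark_name` are hypothetical (they are not in the question), and the demo creates an empty placeholder file in a temporary directory rather than a real PDF.

```python
import os
import tempfile


def list_pdf_paths(directory):
    """Return the full path of every .pdf file in `directory` (hypothetical helper)."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(".pdf")
    )


def bookmark_name(path):
    """Build the bookmark title from the text after '$' in the file name (hypothetical helper)."""
    stem = os.path.splitext(os.path.basename(path))[0]  # drop directory and extension
    return stem[stem.index("$") + 1:].replace("_", " ")  # underscores become spaces


# Demo: an empty placeholder file stands in for a real PDF.
with tempfile.TemporaryDirectory() as demo_dir:
    fname = "Smith#00$Consolidated_Performance.pdf"
    open(os.path.join(demo_dir, fname), "wb").close()

    paths = list_pdf_paths(demo_dir)
    # The bare name is resolved against the current working directory,
    # so it is normally not found; the joined path always is.
    bare_name_found = os.path.exists(fname)
    full_path_found = os.path.exists(paths[0])
    titles = [bookmark_name(p) for p in paths]

print(bare_name_found, full_path_found, titles)
```

With paths built this way, the loop can call addbookmark(p) for each p in list_pdf_paths(cdir) and the function opens the file it was meant to open.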