Tags: python, pdf, python-3.8, pypdf

Trying to combine PDFs from multiple folders into one PDF for each folder


I'm very new to Python and programming, and I'm trying to automate an office task that is very time-consuming.

I have multiple folders containing PDFs. For each folder, I need to combine its PDFs into a single PDF and save that file inside the same folder. I've successfully combined the contents of one folder and saved the result to my desktop using this:

import PyPDF2
import os
Path = '/Users/jlaw/Desktop/Testing/FolderName/'
filelist = os.listdir(Path)
pdfMerger = PyPDF2.PdfFileMerger(strict=False)
for file in filelist:
    if file.endswith('.pdf'):
        pdfMerger.append(Path+file)
pdfOutput = open('Tab C.pdf', 'wb')
pdfMerger.write(pdfOutput)
pdfOutput.close()

With the code below, I'm trying to do the same thing for every folder in a specific directory. When I run it, a "Tab C.pdf" file appears in each folder as expected, but I can't open any of them.

import PyPDF2
import os
Path = '/Users/jlaw/Desktop/Testing/'
folders = os.listdir(Path)
def pdf_merge(filelist, foldername):
    pdfMerger = PyPDF2.PdfFileMerger()
    for file in filelist:
        if file.endswith('.pdf'):
            pdfMerger.append(Path+foldername+"/"+file)
        pdfOutput = open(Path+foldername+'/Tab C.pdf', 'wb')
        pdfMerger.write(pdfOutput)
        pdfOutput.close()
for folder in folders:
    pdf_merge(Path+'/'+folder, folder)

I'm using Python 3.8.

The Tab C.pdf files are only 1 KB in size. When I try to open one with Adobe Acrobat, a pop-up says, "There was an error opening this document. This file cannot be opened because it has no pages." If I try Chrome, it opens, but it's just an empty PDF. Edge (Chromium-based) says, "We can't open this file. Something went wrong."

Any pieces of advice or hints are much appreciated.


Solution

  • The below works. I'm not experienced enough yet to know why this works while the above doesn't.

    import PyPDF2
    import os
    Path = 'C:/Users/jlaw/Desktop/Testing/'
    folders = os.listdir(Path)
    def pdf_merge(filelist): #Changed to just one argument: the full path of one folder
        pdfMerger = PyPDF2.PdfFileMerger()
        for file in os.listdir(filelist): #added os.listdir(); looping over the path string itself yields single characters
            if file.endswith('.pdf'):
                pdfMerger.append(filelist+'/'+file) #replaced Path+foldername with filelist
        #The next three lines moved out of the loop, so the file is written once, after every PDF has been appended
        pdfOutput = open(filelist+'/Tab C.pdf', 'wb')
        pdfMerger.write(pdfOutput)
        pdfOutput.close()
    for folder in folders:
        pdf_merge(Path+folder) #Removed redundant + "/"
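
As for why the first attempt produced empty files: `pdf_merge` was handed a path string rather than a list of file names, so `for file in filelist` walked over the string one character at a time. No single character ends with '.pdf', so nothing was ever appended, and because the write sat inside the loop, an empty merger was written out on every pass. The sketch below shows that behaviour in miniature and then outlines a slightly sturdier variant. The helper names `pdfs_by_folder` and `merge_folder` are hypothetical, and it assumes the same `PyPDF2.PdfFileMerger` class used in the post; newer releases of the library have renamed or replaced that class, so check the installed version before relying on it.

```python
import pathlib

# The original bug in miniature: looping over a path *string* yields single
# characters, and no single character ends with '.pdf'.
matches = [ch for ch in "/Users/jlaw/Desktop/Testing/FolderName" if ch.endswith(".pdf")]
assert matches == []  # nothing would ever be appended to the merger

OUTPUT_NAME = "Tab C.pdf"  # output file name taken from the post


def pdfs_by_folder(root, output_name=OUTPUT_NAME):
    """Map each subfolder of root to its PDFs, sorted by name (hypothetical helper)."""
    result = {}
    for folder in sorted(pathlib.Path(root).iterdir()):
        if not folder.is_dir():
            continue  # skip loose files sitting next to the folders
        pdfs = sorted(
            p for p in folder.iterdir()
            # leave out a previous output so it is not merged into itself
            if p.is_file() and p.suffix.lower() == ".pdf" and p.name != output_name
        )
        if pdfs:
            result[folder] = pdfs
    return result


def merge_folder(folder, pdfs, output_name=OUTPUT_NAME):
    """Merge pdfs into folder/output_name (assumes the PyPDF2 class from the post)."""
    import PyPDF2  # imported here so the listing helper works without the library

    merger = PyPDF2.PdfFileMerger(strict=False)
    for pdf in pdfs:
        merger.append(str(pdf))
    with open(folder / output_name, "wb") as out:  # written once, after the loop
        merger.write(out)
    merger.close()


# Usage (not run here):
# for folder, pdfs in pdfs_by_folder("C:/Users/jlaw/Desktop/Testing/").items():
#     merge_folder(folder, pdfs)
```

Sorting the names makes the page order predictable, since `os.listdir` returns entries in arbitrary order. Skipping non-directories avoids an error when the top-level directory contains loose files, and skipping an existing "Tab C.pdf" keeps a second run from folding the previous output back into the new one.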