Tags: python, python-3.x, pdf, pypdf

How to split a PDF every 4 pages using PyPDF2 in Python?


I found sample code online that splits a PDF into 2-page files, but I couldn't figure out how to change it to 4 pages. Any tips will be appreciated.

#!/usr/bin/env python3

from PyPDF2 import PdfFileWriter, PdfFileReader
import glob, sys

pdfs = glob.glob("*.pdf")

for pdf in pdfs:
    inputFile = PdfFileReader(open(pdf, "rb"))
    for i in range(inputFile.numPages // 2):
        output = PdfFileWriter()
        output.addPage(inputFile.getPage(i * 2))

        if i * 2 + 1 < inputFile.numPages:
            output.addPage(inputFile.getPage(i * 2 + 1))

        newname = "-" + str(i) + ".pdf"
        outputStream = open(newname, "wb")
        output.write(outputStream)
        outputStream.close()

Solution

  • After working through the code line by line, I came up with a solution. I believe it can still be improved, but it gave me the result I needed. In the for loop I had to increment the counter by 2, to avoid getting the same pages repeated in each file, and I added a couple more addPage statements. Thanks to randomhacks.co.uk for the original code quoted in my question. Any improvements are welcome.

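    #!/usr/bin/env python3

    from PyPDF2 import PdfFileWriter, PdfFileReader
    import glob

    pdfs = glob.glob("*.pdf")

    for pdf in pdfs:
        inputFile = PdfFileReader(open(pdf, "rb"))
        numPages = inputFile.numPages
        # i advances by 2 and is doubled below, so each pass starts 4 pages
        # after the previous one: pages 0-3, 4-7, 8-11, ...
        for i in range(0, (numPages + 1) // 2, 2):
            output = PdfFileWriter()
            output.addPage(inputFile.getPage(i * 2))

            # The last file may hold fewer than 4 pages, so check each index.
            if i * 2 + 1 < numPages:
                output.addPage(inputFile.getPage(i * 2 + 1))
            if i * 2 + 2 < numPages:
                output.addPage(inputFile.getPage(i * 2 + 2))
            if i * 2 + 3 < numPages:
                output.addPage(inputFile.getPage(i * 2 + 3))

            # Prefix the output name with the first 9 characters of the source name.
            newname = pdf[:9] + "-" + str(i) + ".pdf"

            outputStream = open(newname, "wb")
            output.write(outputStream)
            outputStream.close()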
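
Since improvements were invited, here is one possible generalization, offered as a sketch rather than a definitive version. It steps through the pages in groups of any size instead of hard-coding four addPage calls. It assumes a newer release of the library (published as pypdf), where PdfReader, PdfWriter, reader.pages and add_page replace the older names used above. The helper names chunk_ranges and split_pdf and the output naming scheme are made up for illustration.

```python
import os


def chunk_ranges(num_pages, chunk_size=4):
    """Return one range of page indices per output file.

    The final range is shorter when num_pages is not a multiple of chunk_size.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        range(start, min(start + chunk_size, num_pages))
        for start in range(0, num_pages, chunk_size)
    ]


def split_pdf(path, chunk_size=4):
    """Write the pages of path into numbered files of chunk_size pages each."""
    # Imported here so chunk_ranges can be tried without pypdf installed.
    # Assumption: the pypdf package is available.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(path)
    stem = os.path.splitext(path)[0]
    written = []
    for number, pages in enumerate(chunk_ranges(len(reader.pages), chunk_size)):
        writer = PdfWriter()
        for index in pages:
            writer.add_page(reader.pages[index])
        # Hypothetical naming scheme: "<source name>-<chunk number>.pdf".
        newname = f"{stem}-{number}.pdf"
        with open(newname, "wb") as output_stream:
            writer.write(output_stream)
        written.append(newname)
    return written
```

For a 10-page document, chunk_ranges(10) yields the index groups 0-3, 4-7 and 8-9, so the last file simply holds the two leftover pages. Calling split_pdf(name) for each name returned by glob.glob("*.pdf") would mirror the loop above. The output files also match *.pdf, so running it twice in the same directory would split them again.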