
PyPDF module doesn't make a valid PDF file


I'm trying to write a Python program to manipulate my PDF Beamer presentations. The professor uses on-click dynamic transitions, so one slide spans several pages, one per click. I want to print these presentations, but they have around 5000 pages in total. I want to keep only the last click-transition page of each slide, which would reduce the number of pages to around 500. I'm using the PyPDF2 module, but it does not produce a valid PDF file. Here's the code:

from pyPdf import PdfFileWriter, PdfFileReader
import os,sys

pdful = raw_input("Uneti ime fajla:")
output = PdfFileWriter()
input1 = PdfFileReader(open(pdful, "rb"))

m = []
f = True
print ("Uneti strane koje zelite da zadrzite.String 0 kraj unosa:\n")

while f:
   l = int(raw_input("Uneti broj stranice:"))
   if l == 0:
      f = not f
   else: m.append(l-1)

for i in range(len(m)):
    strana  = input1.getPage(int(m[i]))

    output.addPage(strana)

outputStream = file("Mat8.pdf","wb")
output.write(outputStream)
# The prompt strings are in Serbian, but that is not important. The program should take two inputs from the user: the name of the file to manipulate and the pages to copy.

Solution

  • from pyPdf import PdfFileWriter, PdfFileReader: pyPdf is already discontinued and has been succeeded by PyPDF2. I am not sure about Python 2, but in Python 3 you should import PyPDF2.

    1. There is no need to import os and sys. You would need sys only to read command-line arguments: if you run python3 xyz.py some_arg in bash, then sys.argv[1] == "some_arg".
    2. I would prefer using map, as long as you don't need to read the input line by line. For example,

      print ("Uneti strane koje zelite da zadrzite.String 0 kraj unosa:\n")
      m = map (lambda x: int(x) - 1, raw_input("Uneti broj stranice:").split())
      
    3. The map call above replaces the while loop. Also, iterate over the objects themselves instead of over indices.

      for page_number in m:
          strana = input1.getPage(page_number)
          output.addPage(strana)
      
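    4. Finally, use with to enclose file operations. Python will then close the file automatically, even if you forget to. The original script never closes the output file, so buffered data may never reach the disk, which is a likely cause of the invalid PDF.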

      with open("Mat8.pdf", "wb") as outputStream:
          output.write(outputStream)
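
Putting the points above together, here is a minimal Python 3 sketch. It assumes the pypdf package (the current successor of PyPDF2) is installed; the function names parse_page_numbers and keep_pages are my own, not part of any library. Treat it as one possible layout rather than the definitive fix.

```python
def parse_page_numbers(text):
    """Turn '3 7 12' (1-based, space-separated) into [2, 6, 11] (0-based)."""
    return [int(token) - 1 for token in text.split()]


def keep_pages(src_path, dst_path, page_indices):
    """Copy the selected 0-based pages of src_path into a new PDF at dst_path."""
    # Assumption: pypdf is installed. It is imported here so that the
    # parsing helper above can be used without it.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    writer = PdfWriter()
    for index in page_indices:
        writer.add_page(reader.pages[index])

    # The with block closes (and therefore flushes) the output file,
    # even if writing raises an exception.
    with open(dst_path, "wb") as output_stream:
        writer.write(output_stream)
```

A call could look like keep_pages("lecture.pdf", "Mat8.pdf", parse_page_numbers("3 7 12")), where lecture.pdf is a placeholder name. Note that the output path differs from the input path, so the source file is never opened for writing.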