python pdf pypdf python-pdfreader

Python does not print PDF with pyPDF2


I tried to print the pages of a PDF document:

import PyPDF2
FILE_PATH = 'my.pdf'
with open(FILE_PATH, mode='rb') as f:
    reader = PyPDF2.PdfFileReader(f)
    page = reader.getPage(0)  # I also tried other pages, e.g. 1, 2, ...
    print(page.extractText())

But I only get a lot of blank space and no error message. Could it be that the PDF version of this file (my.pdf) is not supported by PyPDF2?

This solved it (it prints all pages of the document). Thanks.

from pdfreader import SimplePDFViewer

NUM_PAGES = 15  # number of pages in my.pdf; adjust for your document

with open("my.pdf", "rb") as fd:
    viewer = SimplePDFViewer(fd)
    # Pages are numbered from 1, so the range must end at NUM_PAGES + 1
    for page_number in range(1, NUM_PAGES + 1):
        viewer.navigate(page_number)
        viewer.render()
        page_text = "".join(viewer.canvas.strings)
        print(page_text)

Solution

  • Try pdfreader

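    from pdfreader import SimplePDFViewer

    with open("my.pdf", "rb") as fd:
        viewer = SimplePDFViewer(fd)
        viewer.render()  # render the page the viewer is currently on

        # Text collected on the canvas while rendering
        page_0_content = viewer.canvas.text_content
        page_0_text = "".join(viewer.canvas.strings)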
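
To read more than one page, the same navigate/render/join steps can be wrapped in a small helper. The following is only a sketch: `iter_page_texts` and `print_pdf_text` are hypothetical names, not part of pdfreader, and the page count is passed in by the caller rather than read from the file. It assumes the viewer numbers pages from 1, as in the loop the asker reported working.

```python
from typing import Iterator


def iter_page_texts(viewer, num_pages: int) -> Iterator[str]:
    """Yield the joined text of pages 1..num_pages.

    `viewer` is assumed to behave like pdfreader's SimplePDFViewer:
    it has navigate(n), render(), and canvas.strings.
    """
    for page_number in range(1, num_pages + 1):
        viewer.navigate(page_number)
        viewer.render()
        yield "".join(viewer.canvas.strings)


def print_pdf_text(path: str, num_pages: int) -> None:
    """Print the text of the first num_pages pages of the PDF at path."""
    # Imported here so iter_page_texts stays usable without pdfreader installed.
    from pdfreader import SimplePDFViewer

    with open(path, "rb") as fd:
        viewer = SimplePDFViewer(fd)
        for text in iter_page_texts(viewer, num_pages):
            print(text)
```

Calling `print_pdf_text("my.pdf", 15)` would then print the first 15 pages. Keeping the loop separate from the file handling makes it possible to try the helper with a stand-in viewer object first.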