I tried to print the pages of a PDF document:
import PyPDF2
FILE_PATH = 'my.pdf'
with open(FILE_PATH, mode='rb') as f:
    reader = PyPDF2.PdfFileReader(f)
    page = reader.getPage(0)  # I also tried other pages, e.g. 1, 2, ...
    print(page.extractText())
But I only get a lot of blank space and no error message. Could it be that the PDF version of this file (my.pdf) is not supported by PyPDF2?
This solved it (it prints all pages of the document). Thanks!
from pdfreader import SimplePDFViewer
with open("my.pdf", "rb") as fd:
    viewer = SimplePDFViewer(fd)
    for i in range(1, 16):  # pages are 1-based: range from 1 to number of pages + 1
        viewer.navigate(i)
        viewer.render()
        page_content = viewer.canvas.text_content  # raw text content, not used below
        page_text = "".join(viewer.canvas.strings)
        print(page_text)
Try pdfreader:
from pdfreader import SimplePDFViewer
with open("my.pdf", "rb") as fd:
    viewer = SimplePDFViewer(fd)
    viewer.render()  # renders the current page; the viewer starts on the first page
    page_0_content = viewer.canvas.text_content
    page_0_text = "".join(viewer.canvas.strings)
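Since the original problem was whitespace-only output with no error, it may help to check for that explicitly and fall back to another library. Below is a minimal sketch of that idea, not part of either library: `first_nonblank_text` and the two stand-in extractors are hypothetical names made up for illustration. In practice you would pass in small wrapper functions around the PyPDF2 and pdfreader snippets above.

```python
from typing import Callable, Iterable, Optional, Tuple


def first_nonblank_text(
    path: str,
    extractors: Iterable[Tuple[str, Callable[[str], str]]],
) -> Tuple[Optional[str], str]:
    """Return (extractor_name, text) for the first extractor whose output
    contains something other than whitespace, or (None, "") if none does."""
    for name, extract in extractors:
        try:
            text = extract(path)
        except Exception:
            # This extractor could not handle the file; try the next one.
            continue
        if text and text.strip():
            return name, text
    return None, ""


# Stand-in extractors for demonstration only (hypothetical, no PDF parsing).
def blank_extractor(path: str) -> str:
    return "   \n\n   "  # mimics the whitespace-only output from the question


def working_extractor(path: str) -> str:
    return "Hello from page 1"


name, text = first_nonblank_text(
    "my.pdf",
    [("blank", blank_extractor), ("working", working_extractor)],
)
print(name, repr(text))
```

The check uses `text.strip()` so that output made up only of spaces and newlines counts as a failed extraction, which is the symptom described in the question.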