"PDF File has not been decrypted" issue still persists in PyPDF2


I have been trying to read PDF documents programmatically in Python using PyPDF2. It works fine for most PDF files, but for a few I get the following error:

raise utils.PdfReadError("File has not been decrypted")
PdfReadError: File has not been decrypted

I have already tried the solutions from another Stack Overflow question: PyPDF 2 Decrypt Not Working

The following solution from that question still did not solve my issue:

import os
import PyPDF2
from PyPDF2 import PdfFileReader

# filename is the path to the PDF being read
fp = open(filename, 'rb')  # PDFs must be opened in binary mode
pdfFile = PdfFileReader(fp)
if pdfFile.isEncrypted:
    try:
        # Try the empty user password first
        pdfFile.decrypt('')
        print('File Decrypted (PyPDF2)')
    except Exception:
        # Fall back to qpdf: decrypt a temporary copy back over the original
        command = ("cp " + filename +
                   " temp.pdf; qpdf --password='' --decrypt temp.pdf " + filename
                   + "; rm temp.pdf")
        os.system(command)
        print('File Decrypted (qpdf)')
        fp = open(filename, 'rb')
        pdfFile = PdfFileReader(fp)
else:
    print('File Not Encrypted')
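
The qpdf fallback quoted above builds a shell string with `cp` and `rm`, which is Unix-only and breaks on paths containing spaces (the folder used later in this question, 'D:/POC PDF', has one). A minimal sketch of the same idea using an argument list instead of a shell string follows. The helper names `build_qpdf_command` and `decrypt_in_place` are hypothetical, qpdf is assumed to be installed and on the PATH, and the flags are the ones from the command above.

```python
import os
import subprocess
import tempfile


def build_qpdf_command(src, dst, password=''):
    """Return the qpdf argument list; flags mirror the command quoted above."""
    # Passing a list (no shell) means spaces in src/dst need no quoting.
    return ['qpdf', '--password=' + password, '--decrypt', src, dst]


def decrypt_in_place(filename, password=''):
    """Decrypt filename with qpdf and replace the original on success."""
    folder = os.path.dirname(filename) or '.'
    fd, tmp = tempfile.mkstemp(suffix='.pdf', dir=folder)
    os.close(fd)
    try:
        # check=True raises CalledProcessError on any non-zero exit status,
        # so the original file is left untouched if qpdf fails.
        subprocess.run(build_qpdf_command(filename, tmp, password), check=True)
        os.replace(tmp, filename)
    finally:
        # Clean up the temporary file if it was not moved into place.
        if os.path.exists(tmp):
            os.remove(tmp)
```

Writing to a temporary file first avoids losing the original when qpdf cannot decrypt it.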

The issue does not seem to be caused by spaces in the file names or by setting the password to ' '.

I have not been able to resolve this error. Any help is appreciated. Thanks.

My code:

import os
import PyPDF2

mypath = 'D:/POC PDF'
# Full paths of the regular files in the folder
onlyfiles = [os.path.join(mypath, f) for f in os.listdir(mypath) if os.path.isfile(os.path.join(mypath, f))]
for file in onlyfiles:
    # PDFs must be opened in binary mode
    with open(file, 'rb') as fp:
        fileReader = PyPDF2.PdfFileReader(fp)
        countpage = fileReader.getNumPages()
    print(countpage)
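
One way to narrow this down is to record which files fail and why, instead of letting the first bad file stop the loop. The sketch below assumes the legacy PyPDF2 1.x API used in this question (`PdfFileReader`, `isEncrypted`, `decrypt`, `getNumPages`). The function names and the injectable `page_count` parameter are hypothetical, added only so the loop can be tried without PyPDF2. In 1.x, `decrypt()` returns 0 when the password does not match, and it appears to raise `NotImplementedError` for encryption algorithms it does not support, which is a plausible cause of the error above.

```python
import os


def pypdf2_page_count(path):
    """Page count via the legacy PyPDF2 1.x API (third-party, imported lazily)."""
    from PyPDF2 import PdfFileReader
    with open(path, 'rb') as fp:
        reader = PdfFileReader(fp)
        if reader.isEncrypted:
            # decrypt() returns 0 if the (empty) password was rejected
            if reader.decrypt('') == 0:
                raise ValueError('empty password rejected')
        return reader.getNumPages()


def count_pages_in_folder(folder, page_count=pypdf2_page_count):
    """Return (counts, failures), each keyed by the full file path."""
    counts, failures = {}, {}
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue
        try:
            counts[path] = page_count(path)
        except Exception as exc:
            # Keep the reason so problem files can be inspected afterwards
            failures[path] = repr(exc)
    return counts, failures
```

Calling `count_pages_in_folder('D:/POC PDF')` would then return the page counts for readable files plus a dictionary of the files that failed, with the underlying exception for each.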

Solution

  • To answer my own question: thanks to a friend of mine, I found a better package than PyPDF2. It is PyMuPDF. Here is a sample implementation:

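    import fitz  # PyMuPDF

    def extractText(file):
        """Return the text of each page as UTF-8 encoded bytes."""
        doc = fitz.open(file)
        text = []
        for page in doc:
            # get_text() is the current name; older PyMuPDF releases spelled it getText()
            t = page.get_text().encode("utf8")
            text.append(t)
        return text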
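
Since `extractText` returns a list of UTF-8 encoded `bytes` objects, one per page, a caller will usually want to turn that back into a single string. A minimal sketch of that step follows. The helper name `join_pages` and the choice of a newline as page separator are assumptions for illustration, not part of PyMuPDF.

```python
def join_pages(pages, separator="\n"):
    """Decode a list of UTF-8 encoded page texts and join them into one str."""
    return separator.join(page.decode("utf8") for page in pages)
```

For example, `join_pages(extractText(path))` would give the whole document as one string, where `path` is the PDF being read.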