Tags: python, file, variables, valueerror, pypdf

Opening a File Using a Variable That Contains the File Name in Python


I'm trying to read a PDF file whose name may change; a preliminary script provides the file name. I successfully save that file name to a variable, but when I try to open a file using that variable I get an error: "ValueError: embedded null byte"

I have tried a couple of solutions; for example, I attempted using this solution, but I receive the same error. I have identified a workaround using glob, since I can predict the file name (I know there will always be one PDF). However, if possible I want to avoid that approach in case we have multiple PDFs to handle in the future.

This is what I have:

pdfFileName = pdfFileName[132:220] # File path is correct, I have confirmed
objectPDF = open(pdfFileName,'rb')
pdfReader = PyPDF2.PdfFileReader(objectPDF)
pageObj = pdfReader.getPage(0)
print(pageObj.extractText())

My error is:

Traceback (most recent call last):
  File "verify.py", line 48, in <module>
    objectPDF = open(pdfFileName,'rb')
ValueError: embedded null byte

What I would like is for the text of the PDF to be output to the console. The error is certainly in the way I'm reading the file: if I hard-code the file path it works as expected, but not when I use a variable that holds the exact same value as the string.


Solution

  • Place this: pdfFileName = pdfFileName.replace('\0','') before this: objectPDF = open(pdfFileName,'rb')

    This removes every null character ('\0') from the string, so open() no longer rejects the path.
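
A minimal sketch of the fix, assuming the sliced string carries a stray null character that print() does not show. The helper name clean_path, the temporary file report.pdf, and its placeholder contents are hypothetical and exist only so the example runs on its own; the PyPDF2 calls from the question are left out here.

```python
import os
import tempfile


def clean_path(raw_name):
    """Return raw_name with every null character removed."""
    return raw_name.replace('\0', '')


# Hypothetical setup: create a small file so the example can run anywhere.
tmp_dir = tempfile.mkdtemp()
real_path = os.path.join(tmp_dir, 'report.pdf')
with open(real_path, 'wb') as f:
    f.write(b'%PDF-1.4 placeholder')

# Simulate a name sliced out of a larger string that carries a stray null.
dirty_name = real_path + '\0'

# print() may hide the null; repr() makes it visible as '\x00'.
print(repr(dirty_name))

# Opening the uncleaned name raises the error from the question.
try:
    open(dirty_name, 'rb')
except ValueError as exc:
    print('open failed:', exc)

# After cleaning, the same name opens normally.
pdfFileName = clean_path(dirty_name)
with open(pdfFileName, 'rb') as objectPDF:
    header = objectPDF.read(8)
print(header)
```

Checking repr(pdfFileName) before calling open() is a quick way to confirm whether a hidden '\x00' is what makes the variable differ from the hard-coded path.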