This is my first Python code. The writer raises an error, which seems to occur randomly while looping through the PDFs.
Wrapping the call in try: / except: pass will not work, because it would just skip the problem file and produce no output for it.
Passing strict=False does not seem to help with the writer.
The error:
PdfReadWarning: Multiple definitions in dictionary at byte 0x6eb54 for key /PageMode [generic.py:587]
PdfReadWarning: Multiple definitions in dictionary at byte 0x75740 for key /PageMode [generic.py:587]
PdfReadWarning: Multiple definitions in dictionary at byte 0xabc13 for key /PageMode [generic.py:587]
Traceback (most recent call last):
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "c:\Users\kmincey.BCSBLOCAL\.vscode\extensions\ms-python.python-2022.4.0\pythonFiles\lib\python\debugpy\__main__.py", line 45, in <module>
cli.main()
File "c:\Users\kmincey.BCSBLOCAL\.vscode\extensions\ms-python.python-2022.4.0\pythonFiles\lib\python\debugpy/..\debugpy\server\cli.py", line 444, in main
run()
File "c:\Users\kmincey.BCSBLOCAL\.vscode\extensions\ms-python.python-2022.4.0\pythonFiles\lib\python\debugpy/..\debugpy\server\cli.py", line 285, in run_file
runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 268, in run_path
return _run_module_code(code, init_globals, run_name,
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "c:\Users\kmincey.BCSBLOCAL\Desktop\Python_scripts\PDFsealer_V2.py", line 56, in <module>
output_pdf.write(f)
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\pdf.py", line 482, in write
self._sweepIndirectReferences(externalReferenceMap, self._root)
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences
self._sweepIndirectReferences(externMap, realdata)
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, value)
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences
self._sweepIndirectReferences(externMap, realdata)
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, value)
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\pdf.py", line 556, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, data[i])
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\pdf.py", line 571, in _sweepIndirectReferences
self._sweepIndirectReferences(externMap, realdata)
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\pdf.py", line 547, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, value)
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\pdf.py", line 556, in _sweepIndirectReferences
value = self._sweepIndirectReferences(externMap, data[i])
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\pdf.py", line 577, in _sweepIndirectReferences
newobj = data.pdf.getObject(data)
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\pdf.py", line 1611, in getObject
retval = readObject(self.stream, self)
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\generic.py", line 66, in readObject
return DictionaryObject.readFromStream(stream, pdf)
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\generic.py", line 579, in readFromStream
value = readObject(stream, pdf)
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\generic.py", line 68, in readObject
return readHexStringFromStream(stream)
File "C:\Users\kmincey.BCSBLOCAL\AppData\Local\Programs\Python\Python39\lib\site-packages\PyPDF2\generic.py", line 311, in readHexStringFromStream
raise PdfStreamError("Stream has ended unexpectedly")
PyPDF2.utils.PdfStreamError: Stream has ended unexpectedly
I have read several posts about needing to pass strict=False to the reader so that these problems are reported as warnings rather than errors: https://stackoverflow.com/questions/42570432/pypdf2-stream-has-ended-unexpectedly, https://github.com/mstamy2/PyPDF2/issues/99. This worked in most cases; however, the writer now seems to be the problem.
Thanks in advance for any advice.
For loop snippet for reference:
for file in input_pdf:
    output_pdf = PdfFileWriter()
    sg.OneLineProgressMeter('My Meter', i, page_count, 'And now we Wait.....')
    PageObj = PyPDF2.PdfFileReader(open(file, "rb"), strict=False).getPage(0)
    PageObj.scaleTo(11*72, 17*72)
    PageObj.mergePage(Seal_pdf.getPage(0))
    output_pdf.addPage(PageObj)
    output_filename = f"{file}"
    f = open(output_filename, "wb+")
    output_pdf.write(f)
    i = i + 1
    f.close()
Thanks to the helpful input from @cards and @KJ, I discovered that the problem was that I was trying to overwrite a file that was still in use. As the traceback shows, output_pdf.write(f) still pulls objects from the reader's open input stream, so reopening the same path with "wb+" truncated the source before the writer had finished reading it, hence "Stream has ended unexpectedly". The solution I went with was simply to save the output under a different name and write some more code to clean up the directory afterwards. Thanks for the assist.
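For anyone hitting the same thing, here is a minimal sketch of the "save under a different name, then clean up" approach. The names rewrite_in_place, transform and seal_first_page are hypothetical helpers made up for this illustration, not part of PyPDF2. The PyPDF2 calls are the same legacy ones used in the loop above, and the sketch assumes that writing to a temporary file in the same folder and then swapping it in is an acceptable form of cleanup.

```python
import os
import tempfile


def rewrite_in_place(path, transform):
    """Run transform(src, dst), then replace path with the new file.

    The output goes to a temporary file in the same directory, so the
    source is never truncated while something is still reading from it.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        # Both handles are closed when the with block exits, which is
        # needed on Windows before the original can be replaced.
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            transform(src, dst)
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the original untouched and remove the partial output.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def seal_first_page(src, dst, seal_page):
    """Hypothetical transform mirroring the loop body from the question."""
    # Imported here so rewrite_in_place works without PyPDF2 installed.
    from PyPDF2 import PdfFileReader, PdfFileWriter

    page = PdfFileReader(src, strict=False).getPage(0)
    page.scaleTo(11 * 72, 17 * 72)
    page.mergePage(seal_page)
    writer = PdfFileWriter()
    writer.addPage(page)
    writer.write(dst)
```

Inside the original loop this would be called roughly as rewrite_in_place(file, lambda src, dst: seal_first_page(src, dst, Seal_pdf.getPage(0))). If the transform fails, the original PDF is left as it was, so a try/except around the call can log the file and move on without losing it.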