I asked an earlier question about this here: pypdf2-merging-pdf-pages-issue
Since then I have come a long way: I can now create my PDF files from an Excel document, via pandas, with PyPDF2.
I also now get the right number of pages per PDF. However, my problem now is that the merged PDF files are blank.
When I debug, I can see that in my second loop the variable "paths" contains the correct paths to my physical PDF files. But when they then pass through:
    with path.open('rb') as pdf:
        pdf_writer.append(pdf)
an extra "\" suddenly appears in the paths, so that a path that was c:\users\... is now shown as c:\\users\\...
I don't know whether this is what prevents the files from being opened and read correctly and then merged into one PDF file.
I hope someone can guide me, as I am self-taught in Python, or can otherwise explain why my merged PDF files come out as 3 blank pages.
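A minimal sketch of what the debugger is probably showing, assuming the doubled backslash is only the escaped representation of the path string and not a change to the path itself. `PureWindowsPath` is used so the snippet runs on any operating system, and `example.pdf` is a made-up file name:

```python
from pathlib import PureWindowsPath

# 'example.pdf' is a hypothetical file name used only for illustration.
path = PureWindowsPath('C:/Users/TH/PDF') / 'example.pdf'

as_text = str(path)      # the actual path text, with single backslashes
as_repr = repr(as_text)  # the escaped form that debuggers and the REPL display

print(as_text)  # C:\Users\TH\PDF\example.pdf
print(as_repr)  # 'C:\\Users\\TH\\PDF\\example.pdf'
```

If that assumption holds, the path is unchanged and the extra backslashes are not what causes the blank pages.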
My code is:
    import datetime  # Handle dates
    import pandas as pd  # Handle data from the Excel sheet (data analysis)
    import PyPDF2 as pdf2  # Handle PDF reading and merging
    from pathlib import Path  # Handle paths

    # Skip ERROR-message: Xref table not zero-indexed. ID numbers for objects will be corrected.
    # import sys
    # if not sys.warnoptions:
    #     import warnings
    #     warnings.simplefilter("ignore")

    PDF_PATH = Path('C:/Users/TH/PDF/')
    EXCEL_FILENAME = 'Resources/liste.xlsx'


    def main():
        today = datetime.date.today()  # The date now
        next_week = today.isocalendar()[1] + 1  # 0=Year, 1=week
        resources = pd.read_excel(EXCEL_FILENAME, sheet_name='Ark1')

        for row in resources.itertuples():
            year = row.Aargang
            paths = [
                (PDF_PATH / row.Oevelse1).with_suffix('.pdf'),
                (PDF_PATH / row.Oevelse2).with_suffix('.pdf'),
                (PDF_PATH / row.Oevelse3).with_suffix('.pdf'),
            ]

            pdf_writer = pdf2.PdfFileMerger()
            for path in paths:
                with path.open('rb') as pdf:
                    pdf_writer.append(pdf)

            with open(f'Uge {next_week} - {year} Merged_doc.pdf', 'wb') as output:
                pdf_writer.write(output)


    if __name__ == '__main__':
        main()
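One detail of the path construction that may be worth checking, sketched under assumptions: `with_suffix('.pdf')` replaces whatever follows the last dot in the name, so an exercise name that itself contains a dot would lose part of its name. The exercise names and the two helper functions below are hypothetical, and `PureWindowsPath` is used only so the snippet runs on any operating system:

```python
from pathlib import PureWindowsPath

PDF_PATH = PureWindowsPath('C:/Users/TH/PDF')


def pdf_path_replacing_suffix(name):
    # Same construction as in the code above: replaces any existing suffix.
    return (PDF_PATH / name).with_suffix('.pdf')


def pdf_path_appending_suffix(name):
    # Alternative: append '.pdf' and leave dots inside the name alone.
    return PDF_PATH / f'{name}.pdf'


# Hypothetical exercise name containing a dot.
print(pdf_path_replacing_suffix('Exercise 1.2').name)  # Exercise 1.pdf
print(pdf_path_appending_suffix('Exercise 1.2').name)  # Exercise 1.2.pdf
```

For names without a dot both helpers give the same result. The appending variant would produce `.pdf.pdf` if the Excel cell already ends in `.pdf`, so which one fits depends on what the sheet actually contains.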
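@anon01 Thanks.
Thanks and credit also go to Sirius3.
The problem was in how I was using PyPDF2, together with some bugs in the library. After changing the code as follows (passing each path to append() as a string instead of an open file object, and closing the merger after writing), it works: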
    import datetime  # Handle dates
    import pandas as pd  # Handle data from the Excel sheet (data analysis)
    from PyPDF2 import PdfFileMerger  # Handle PDF reading and merging
    from pathlib import Path  # Handle paths

    # Skip ERROR-message: Xref table not zero-indexed. ID numbers for objects will be corrected.
    # import sys
    # if not sys.warnoptions:
    #     import warnings
    #     warnings.simplefilter("ignore")

    PDF_PATH = Path('C:/Users/TH/PDF')
    EXCEL_FILENAME = 'Resources/liste.xlsx'


    def main():
        today = datetime.date.today()  # The date now
        next_week = today.isocalendar()[1] + 1  # 0=Year, 1=week
        resources = pd.read_excel(EXCEL_FILENAME, sheet_name='Ark1')

        for row in resources.itertuples():
            year = row.Aargang
            paths = [
                (PDF_PATH / row.Oevelse1).with_suffix('.pdf'),
                (PDF_PATH / row.Oevelse2).with_suffix('.pdf'),
                (PDF_PATH / row.Oevelse3).with_suffix('.pdf'),
            ]

            pdf_merger = PdfFileMerger()
            for path in paths:
                pdf_merger.append(str(path))

            with open(f'Uge {next_week} - {year} Merged_doc.pdf', 'wb') as output:
                pdf_merger.write(output)
            pdf_merger.close()


    if __name__ == '__main__':
        main()
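A side note on the week number, as a sketch and not something the fix above depends on: `today.isocalendar()[1] + 1` can give week 53 or 54 at the end of a year. One possible alternative is to add seven days first and then ask for the ISO week. The helper name `next_iso_week` is my own:

```python
import datetime


def next_iso_week(today):
    """Return (ISO year, ISO week) for the week following `today`."""
    one_week_later = today + datetime.timedelta(weeks=1)
    iso_year, iso_week, _ = one_week_later.isocalendar()
    return iso_year, iso_week


# 28 December 2020 lies in ISO week 53, so "+ 1" would give 54.
print(next_iso_week(datetime.date(2020, 12, 28)))  # (2021, 1)
```

In the middle of a year this gives the same week number as the `+ 1` version, and it also returns the matching ISO year if that is wanted in the file name.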