PyPDF2 corrupts file when watermarking


I have been trying to speed up our date-stamping process by adding the stamp to PDFs as a watermark with PyPDF2. I'm pretty new to coding, so I found the code below online.

When I run it, it seems to work, but the resulting file is corrupted and won't open. Does anyone have any idea where I am going wrong?

from PyPDF2 import PdfFileWriter, PdfFileReader


def create_watermark(input_pdf, output_pdf, watermark):
    watermark_obj = PdfFileReader(watermark, False)
    watermark_page = watermark_obj.getPage(0)

    pdf_reader = PdfFileReader(input_pdf)
    pdf_writer = PdfFileWriter()

    # Watermark all the pages
    for page in range(pdf_reader.getNumPages()):
        page = pdf_reader.getPage(page)
        page.mergePage(watermark_page)
        pdf_writer.addPage(page)

    with open(input_pdf, 'wb') as out:
        pdf_writer.write(out)


if __name__ == '__main__':
    input_pdf = "C:\\Users\\A***\\OneDrive - ***\\Desktop\\Invoice hold\\Test\\1.pdf"
    output_pdf = "C:\\Users\\A***\\OneDrive - ***\\Desktop\\Invoice hold\\Test\\1 WM.pdf"
    watermark = "C:\\Users\\A***\\OneDrive - ***\\Desktop\\Invoice hold\\WM.pdf"

    create_watermark(input_pdf, output_pdf, watermark)

Solution

  • Your code opens input_pdf for writing, so it overwrites the original file and never creates output_pdf:

    with open(input_pdf, 'wb') as out:
        pdf_writer.write(out)

    Opening a file in 'wb' mode truncates it immediately, so if anything goes wrong while writing, the original PDF is left damaged.

    If you want to save the result under the name output_pdf, write to that path instead:

    with open(output_pdf, 'wb') as out:
        pdf_writer.write(out)

    With this change applied to your code, I was able to insert the watermark successfully.

    I also recommend checking that your PDF file itself is not damaged, since an earlier run may already have overwritten it.
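
If you ever do need to write back to an existing path, one way to limit the risk described above is to write to a temporary file first and only move it into place once it passes a basic check. The following is a minimal sketch using only the standard library, not part of PyPDF2. The names looks_like_pdf and write_safely are hypothetical helpers invented for this example, and the check only looks for the %PDF- header and the %%EOF marker, so it can catch a truncated or empty file but does not prove the PDF is valid.

```python
import os
import tempfile


def looks_like_pdf(path):
    """Cheap sanity check (hypothetical helper): a %PDF- header near the
    start of the file and an %%EOF marker near the end."""
    with open(path, "rb") as f:
        head = f.read(1024)
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - 1024))
        tail = f.read()
    return b"%PDF-" in head and b"%%EOF" in tail


def write_safely(target_path, write_func):
    """Write through a temporary file, then replace target_path.

    write_func is any callable that accepts a binary file object,
    for example pdf_writer.write from the question (an assumption
    about how you would plug this in).
    """
    folder = os.path.dirname(os.path.abspath(target_path))
    fd, tmp_path = tempfile.mkstemp(suffix=".pdf", dir=folder)
    try:
        with os.fdopen(fd, "wb") as tmp:
            write_func(tmp)
        if not looks_like_pdf(tmp_path):
            raise ValueError("output does not look like a PDF: " + tmp_path)
        # Only now touch the target; an existing file stays intact until here.
        os.replace(tmp_path, target_path)
    except BaseException:
        # Clean up the temporary file and leave the target untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

In the question's function this would replace the final with block, for example write_safely(output_pdf, pdf_writer.write). If writing fails partway, the exception is re-raised and whatever was already at the target path is left as it was.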