Unable to merge pdf watermark at the correct position


I'm writing a PDF editor in Python with three functions:

  1. Delete the last page
  2. Delete all blank pages
  3. Add a watermark

I used a standard watermarking routine, but the watermark keeps being added at the wrong location on a test PDF. Is there something wrong with the code or with the PDF? I suspect it is the PDF.

from PyPDF2 import PdfReader, PdfWriter

pdf_file = "sample.pdf"
watermark = "watermark.pdf"
merged = "result.pdf"

with open(pdf_file, "rb") as input_file, open(watermark, "rb") as watermark_file:
    input_pdf = PdfReader(input_file)
    watermark_pdf = PdfReader(watermark_file)
    watermark_page = watermark_pdf.pages[0]

    output = PdfWriter()

    for i in range(len(input_pdf.pages)):
        pdf_page = input_pdf.pages[i]
        pdf_page.merge_page(watermark_page)
        output.add_page(pdf_page)

    with open(merged, "wb") as merged_file:
        output.write(merged_file)

I saved it on replit.com so you can run it; the PDF is there as well: https://replit.com/@ygp3737/NavyNotedTypes#main.py


Solution

  • The problem here is that Python has already been used to corrupt a very good source imposition of a printable book, which is itself a sampler from a publicly available MP4 video book.

    From the comments, this is a "minimal example" standing in for another, "private" layout. The printable books here are published in the public domain (WWW.littlefox.com), so watermarking part of one is unnecessary, especially as the file is an Adobe InDesign 17.0 (Windows) imposition that has already been corrupted by a prior PyPDF2 modification.

    Here we can see that the desired format includes a reversed double layout, since page 2 is supposed to be folded or slit in a print house.

    I would suggest you get the source document and print from that rather than from an inferior copy. When printed, some of the 14 pages will be "upside down"; those are more easily corrected by rotating them as PDF pages than by fiddling with the file the way PyPDF was used. Personally, I would not bother adding a watermark to a published document, since a watermark is far easier to remove than the more problematic issue of rotating printable output is to solve.

    https://res.littlefox.com/en/supplement/load_pdf/C0005234?_=242424

    Programming Answer

    In your OS shell you need two commands.

    The first splits each page in half, so it could be the equivalent of:

    mutool poster -x 1 -y 2 -o book-out.pdf book-in.pdf
    

    That works very well for splitting 7 pages into 14.
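If you would rather stay in Python than call mutool, the same top/bottom split can be roughly sketched with PyPDF2 by giving each half its own media box. This is only an illustration under assumptions, not what mutool does internally: `halve_box` and `split_pages` are hypothetical helper names, the file names are up to you, each sheet is assumed to be cut at its vertical midpoint, and the top half is assumed to come first in reading order.

```python
def halve_box(left, bottom, right, top):
    """Return (top_half, bottom_half) rectangles as [left, bottom, right, top]."""
    mid = (bottom + top) / 2
    return [left, mid, right, top], [left, bottom, right, mid]


def split_pages(src, dst):
    """Write every page of src twice: its top half, then its bottom half."""
    # Imported here so halve_box stays usable without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter
    from PyPDF2.generic import RectangleObject

    # Two independent readers give two independent copies of each page,
    # so changing one copy's media box does not affect the other copy.
    tops, bottoms = PdfReader(src), PdfReader(src)
    writer = PdfWriter()
    for upper, lower in zip(tops.pages, bottoms.pages):
        box = upper.mediabox
        top_half, bottom_half = halve_box(
            float(box.left), float(box.bottom), float(box.right), float(box.top)
        )
        upper.mediabox = RectangleObject(top_half)
        lower.mediabox = RectangleObject(bottom_half)
        writer.add_page(upper)
        writer.add_page(lower)
    with open(dst, "wb") as fh:
        writer.write(fh)
```

For an A4 sheet (595 x 842 points) the midpoint works out to 421 points. Pages that carry their own crop box would need that box adjusted as well, which this sketch does not attempt.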

    The second command rotates the even pages except page 2, that is pages 4, 6, and so on. This is now the hardest part, since mutool has no easy way to rotate selectively. The result would contain only the 6 rotated pages, and then we would have more problems.

    Thus both coded actions are best done another way.

    The simplest would be to replace the first mutool command with Coherent PDF (cpdf):

    cpdf -chop-h 421 in.pdf -o out.pdf
    

    However, I can't show you the result, as that needs version 2.7+ and I only have the last 32-bit version, 2.6. The second command then uses cpdf to rotate the selected pages:

    cpdf -decrypt-force -rotate 180 bat1.pdf 4,6,8,10,12,14 -o out.pdf
    

    All the pages will now be viewable correct way up.
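For comparison, the same selective rotation could be sketched in Python with PyPDF2. Treat it as a sketch under assumptions: `pages_to_rotate` and `rotate_selected` are hypothetical helper names, and the rule "every even page after page 2" is taken from the page list in the cpdf command above.

```python
def pages_to_rotate(total_pages):
    """1-based numbers of the even pages after page 2: 4, 6, 8, ..."""
    return list(range(4, total_pages + 1, 2))


def rotate_selected(src, dst, page_numbers):
    """Copy src to dst, turning the listed 1-based pages by 180 degrees."""
    # Imported here so pages_to_rotate stays usable without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    wanted = set(page_numbers)
    writer = PdfWriter()
    for number, page in enumerate(PdfReader(src).pages, start=1):
        if number in wanted:
            page.rotate(180)  # adds 180 degrees to the page's /Rotate entry
        writer.add_page(page)
    with open(dst, "wb") as fh:
        writer.write(fh)
```

Joining `pages_to_rotate(14)` with commas gives `4,6,8,10,12,14`, the page range passed to cpdf above.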

    Finally

    cpdf also has watermarking functions that can be applied in combination with the two commands above; in theory all three steps can be combined in a single command using AND.

    For details, see Chapter 8, "Watermarks and Stamps", in the manual, which covers many text, image, and import options.
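If the watermark has to be added from Python after all, one possible sketch is to prepare a second copy of the watermark turned by 180 degrees and use it on pages whose `/Rotate` entry is 180. This rests on assumptions not confirmed for the sample file: that the misplacement comes from page rotation, that the watermark page is the same size as the target pages, and that its media box starts at the origin. `flip_matrix`, `apply_matrix`, and `watermark_all` are hypothetical helper names.

```python
def flip_matrix(width, height):
    """PDF matrix (a, b, c, d, e, f) turning content 180 degrees about the page centre."""
    return (-1, 0, 0, -1, width, height)


def apply_matrix(matrix, x, y):
    """Map a point through a PDF transformation matrix."""
    a, b, c, d, e, f = matrix
    return (a * x + c * y + e, b * x + d * y + f)


def watermark_all(src, mark, dst):
    """Stamp the first page of mark onto every page of src, compensating for /Rotate 180."""
    # Imported here so the matrix helpers stay usable without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter, Transformation

    # Two separate readers: add_transformation changes a page in place,
    # so the flipped copy must not share an object with the upright one.
    upright = PdfReader(mark).pages[0]
    flipped = PdfReader(mark).pages[0]
    box = flipped.mediabox
    flipped.add_transformation(
        Transformation(flip_matrix(float(box.width), float(box.height)))
    )

    writer = PdfWriter()
    for page in PdfReader(src).pages:
        # Only a /Rotate set directly on the page is checked (assumption);
        # a value inherited from a parent node is not looked up here.
        rotation = int(page.get("/Rotate", 0)) % 360
        page.merge_page(flipped if rotation == 180 else upright)
        writer.add_page(page)
    with open(dst, "wb") as fh:
        writer.write(fh)
```

The matrix sends the lower-left corner of the watermark to the upper-right corner of the page and vice versa, so the mark should read upright once the viewer applies the page's own 180 degree rotation.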