python django django-views pdf-generation

How to crop parts of a PDF with Python?


Situation: I am uploading a PDF file that contains an image.

The idea is to cut this PDF into 2 or 4 parts.

I am trying to apply the following logic:

Make a cut by dividing the file in half:

In my case, the file is always an A3 by default, measuring 29.7cm by 42cm.

So, dividing it into two proportional parts, the first half would be 29.7cm x 0-21cm and the second half would be 29.7cm x 21-42cm.

Then I would scale each half back up to A3, so that each page contains only half of the original content.

Finally, I would generate a PDF file with 2 pages in total.
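The geometry of the steps above can be sketched in a few lines. This is only an illustration of the arithmetic, not a PDF implementation; the helper names `halves_cm` and `fit_scale` are hypothetical. It assumes the cut runs across the longer side, as in the 29.7 cm x 42 cm case described.

```python
POINTS_PER_CM = 72 / 2.54  # 1 inch = 72 PDF points = 2.54 cm


def halves_cm(width_cm, height_cm):
    """Cut a page across its longer side; return two (x0, y0, x1, y1) boxes in cm."""
    if height_cm >= width_cm:
        mid = height_cm / 2
        return [(0, 0, width_cm, mid), (0, mid, width_cm, height_cm)]
    mid = width_cm / 2
    return [(0, 0, mid, height_cm), (mid, 0, width_cm, height_cm)]


def fit_scale(box, target_w, target_h):
    """Largest uniform scale at which the box still fits inside the target."""
    box_w, box_h = box[2] - box[0], box[3] - box[1]
    return min(target_w / box_w, target_h / box_h)


first, second = halves_cm(29.7, 42.0)

# Each half is 29.7 cm x 21 cm (landscape), so it is scaled onto a
# landscape A3 sheet of 42 cm x 29.7 cm.
scale = fit_scale(first, 42.0, 29.7)

# The same page width expressed in PDF points, the unit PDF libraries use.
page_width_pt = 29.7 * POINTS_PER_CM
```

Each half comes out at 29.7 cm x 21 cm, and the scale needed to bring it back to A3 is about 1.41, so the content of each half is enlarged by roughly 41% in each direction.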

The result I hope to reproduce is something similar to this example:

[image: example of the desired result]

But instead of working with images, I would be working with PDFs.

My attempt

from django.shortcuts import render
from django.http import HttpResponse
from .forms import CartazUploadForm
import PyPDF2
from io import BytesIO
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import A3
from pdf2image import convert_from_bytes

def make_gradil(request):
    if request.method == 'POST':
        form = CartazUploadForm(request.POST, request.FILES)
        if form.is_valid():
            gradil = form.cleaned_data['gradil']
            cartaz_file = request.FILES['cartaz_file']

            # Create a new PDF writer
            writer = PyPDF2.PdfWriter()

            # Read the uploaded PDF
            reader = PyPDF2.PdfReader(cartaz_file)

            for page_num in range(len(reader.pages)):
                page = reader.pages[page_num]

                # Convert the page to an image
                pdf_bytes = BytesIO()
                writer_temp = PyPDF2.PdfWriter()
                writer_temp.add_page(page)
                writer_temp.write(pdf_bytes)
                pdf_bytes.seek(0)

                images = convert_from_bytes(pdf_bytes.read())
                if not images:
                    continue

                # Get the page image
                image = images[0]

                # Create a buffer for the new PDF
                buffer = BytesIO()
                c = canvas.Canvas(buffer, pagesize=A3)

                # Split the image into two halves
                left_half = image.crop((0, 0, image.width // 2, image.height))
                right_half = image.crop((image.width // 2, 0, image.width, image.height))

                # Resize and add the first half
                left_half.save("left_half.png")
                c.drawImage("left_half.png", 0, 0, width=A3[0] / 2, height=A3[1], preserveAspectRatio=True)
                c.showPage()

                # Resize and add the second half
                right_half.save("right_half.png")
                c.drawImage("right_half.png", 0, 0, width=A3[0] / 2, height=A3[1], preserveAspectRatio=True)
                c.showPage()

                c.save()
                buffer.seek(0)

                # Add the resized pages to the writer
                new_reader = PyPDF2.PdfReader(buffer)
                for new_page in new_reader.pages:
                    writer.add_page(new_page)

            # Create the response
            response = HttpResponse(content_type='application/pdf')
            response['Content-Disposition'] = 'attachment; filename="gradil.pdf"'

            # Write the PDF to the response
            with BytesIO() as output_buffer:
                writer.write(output_buffer)
                response.write(output_buffer.getvalue())

            return response
    else:
        form = CartazUploadForm()
    return render(request, 'core/index.html', {'form': form})

I'm using Django. The code above raises an error about Poppler, which pdf2image requires, and I would prefer not to depend on another installed program.

Does anyone have an idea of how to do this?


Solution

  • This operation is a type of imposition. In Acrobat's Print dialog, or in command-line tools, it usually goes by a name such as "Print as Poster", "Split" or "Grid".

    The "decimation" factor in this case is simply 2 x 2.

    The main open question, since the source is not fully specified, is whether scaling is required before or after the split. It generally does not matter when it happens, since the result depends only on the desired ratio, but each command-line tool may handle it differently.

    I suggest cpdf as perhaps the easiest utility for this, since it can do the task in a single OS command called from Python. You may find, however, that PyMuPDF integrates better with Python code.
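As a rough illustration of the PyMuPDF route, here is a minimal sketch rather than a definitive implementation. The names `grid_clips` and `split_pdf` are hypothetical helpers. The sketch assumes PyMuPDF is installed and that the pages are not rotated. It does not need Poppler, and the page content is placed as a clipped, scaled copy instead of being rasterized.

```python
def grid_clips(width, height, cols, rows):
    """Return (x0, y0, x1, y1) tiles, left to right, then top to bottom.

    Coordinates use a top-left origin, the convention PyMuPDF uses.
    """
    tile_w, tile_h = width / cols, height / rows
    return [
        (c * tile_w, r * tile_h, (c + 1) * tile_w, (r + 1) * tile_h)
        for r in range(rows)
        for c in range(cols)
    ]


def split_pdf(pdf_bytes, cols=2, rows=1, paper="a3"):
    """Cut every page into cols x rows tiles and return the new PDF as bytes."""
    # PyMuPDF; imported here so that grid_clips can be used without it.
    import fitz

    src = fitz.open(stream=pdf_bytes, filetype="pdf")
    out = fitz.open()  # new, empty PDF
    paper_w, paper_h = fitz.paper_size(paper)

    for page in src:
        rect = page.rect
        # Offset in case the CropBox does not start at (0, 0).
        shift = page.cropbox_position
        for x0, y0, x1, y1 in grid_clips(rect.width, rect.height, cols, rows):
            clip = fitz.Rect(x0 + shift.x, y0 + shift.y, x1 + shift.x, y1 + shift.y)
            # Use the paper in the orientation that matches the tile.
            if (clip.width > clip.height) != (paper_w > paper_h):
                out_w, out_h = paper_h, paper_w
            else:
                out_w, out_h = paper_w, paper_h
            new_page = out.new_page(width=out_w, height=out_h)
            # show_pdf_page fits the clipped region into the target rectangle,
            # keeping proportions by default.
            new_page.show_pdf_page(new_page.rect, src, page.number, clip=clip)

    return out.tobytes()
```

In the Django view from the question, this could be called as `split_pdf(cartaz_file.read(), cols=1, rows=2)` for two halves, or with `cols=2, rows=2` for the 2 x 2 case, and the returned bytes written to the `HttpResponse`. Here the scaling happens after the split, which is one of the two orders discussed above.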