python pdf pdf-generation

How to mass-produce PDFs efficiently in Python with different variable inputs for each PDF


I have recently begun automating PDF generation for investor relations clients. We need to send out PDFs en masse, but each PDF needs a unique logo and company name in the bottom corner (the logos are stored in a folder and the corresponding names in a txt file).

Furthermore, each page of the PDF is predefined, but there are a few variables that are custom, such as "This year, revenue has increased by X%". I also have the X for each company, etc.

Desired input: Company name and logo

Desired output: PDF based on the standard template, but with the company's name and logo swapped in

I have tried the following:

from FPDF import FPDF

pdfs = []

dct = {
    "company1": 5,
}

# minimal example of what I have tried, but doesn't work
for company in open("company_names.txt", "r").readlines():
    pdf = FPDF(orientation = 'P', unit = 'mm', format = 'A4')
    pdf.add_page()
    pdf.set_font('helvetica', 'bold', 10)
    pdf.add_text(company)
    pdf.add_text(f"Revenue has increased by {dct[company]}%" )
    pdf.add_picture(f"logos/{company}.png") # <-- this, among other things, don't work

    pdfs.append(pdf)

Any help would be appreciated. Speed improvements would also be welcome, as this needs to generate thousands of PDFs.


Solution

  • I would recommend using the reportlab.pdfgen module. Docs can be found here: https://docs.reportlab.com/reportlab/userguide/ch2_graphics/

    To insert variable text, I would recommend saving your template as a .png file. The image becomes the background of the generated PDF, with the text drawn in front of it, creating the illusion that the text was there all along.

    Generation speed is mostly determined by the library you choose, so there is little you can tune yourself. In my experience, though, it's rare to need sub-second generation for this purpose.

    I made a similar project, and it looks like this:

    from reportlab.pdfgen import canvas
    
    company_name = input("Enter the company name: ") # Replace this with your loop over companies
    
    pdf_filename = f"{company_name}_file.pdf"
    
    height_page = 720 # Page size in points; change this to your template size
    width_page = 1270
    
    template_image = "template.png" # Placeholder path; point this at your own template image
    
    c = canvas.Canvas(pdf_filename, pagesize=(width_page, height_page))
    
    def add_text(text, x, y, size=12): # Creating this function makes it easier to insert text
        c.setFillColorRGB(1, 1, 1) # White; reportlab expects colour values between 0 and 1
        c.setFont("Helvetica", size)
        c.drawString(x, y, text)
    
    # Draw the template first so that the text ends up in front of it
    c.drawImage(template_image, 0, 0, width=width_page, height=height_page, mask=None)
    
    add_text(f"{company_name} is awesome!", 100, 200, 67) # Example of use
    
    # To insert logos you would also use c.drawImage(), at the position and size you need
    
    c.showPage() # Finishes the current page; anything drawn afterwards goes on a new page
    
    c.save() # Writes the PDF to disk