
Creating and fixing a PDF with margins and text styles in ReportLab


I'm having an issue creating a report using ReportLab in Python.

I'm using a JSON file from an AWS S3 bucket. Here is a sample of the information:

from reportlab.lib.pagesizes import A1
from reportlab.pdfgen import canvas
from reportlab.lib.units import cm

    
company_info = [
    {'title': 'What were Harley\'s total sales for Q3?', 
     'context': 'Harley\'s total sales for Q3 were \n **$15 million**, representing a **10% increase** compared to Q2.'},
    {'title': 'Which region showed the highest sales?', 
     'context': 'The **North American region** showed the highest sales, contributing **$8 million** to the total.'},
    {'title': 'What was the percentage increase in sales for the European market?', 
     'context': 'The **European market** experienced a \n **12% increase** in sales, totaling **$4 million** for Q3.'},
    {'title': 'Did Harley\'s introduce any new products in Q3?', 
     'context': ('In Q3, Harley\'s made a significant impact with the introduction of **two new products** that have been well-received by the market. '
                 'The **Harley Davidson X1** is a cutting-edge motorcycle designed with advanced technology and performance enhancements, catering to the evolving needs of enthusiasts. '
                 'Alongside, the **Harley Davidson Pro Series** offers a range of high-performance accessories aimed at enhancing the riding experience. '
                 'These new products have been introduced in response to customer feedback and market trends, reflecting Harley\'s commitment to innovation and quality. '
                 'The product launches have been supported by comprehensive marketing efforts, including promotional events and digital campaigns, which have effectively generated excitement and increased consumer interest.')},
    {'title': 'What was the impact of the new product launches on sales?', 
     'context': ('The recent product launches had a substantial impact on Harley\'s sales performance for Q3. The introduction of the **Harley Davidson X1** and the **Harley Davidson Pro Series** '
                 'contributed an additional **$2 million** in revenue, accounting for approximately **13%** of the total Q3 sales. '
                 'These new products not only boosted overall sales but also enhanced Harley\'s market position by attracting new customers and increasing repeat purchases. '
                 'The successful integration of these products into the company\'s existing lineup demonstrates Harley\'s ability to innovate and adapt to market demands. '
                 'Ongoing customer feedback and sales data will continue to inform product development and marketing strategies, ensuring that Harley\'s maintains its competitive edge and meets consumer expectations.')},
]

I'm using this code to generate the report, and I want to include a background image.

def create_report(company_info, image_path, output_pdf_path):
    # Define margins
    top_margin = 12 * cm
    bottom_margin = 8 * cm
    left_margin = 2 * cm
    right_margin = 2 * cm
    
    # Create a canvas object
    c = canvas.Canvas(output_pdf_path, pagesize=A1)
    
    # Draw the background image
    c.drawImage(image_path, 0, 0, width=A1[0], height=A1[1])
    
    # Define text properties
    font_size = 18
    c.setFont("Helvetica-Bold", font_size)
    
    # Calculate y position
    y_position = A1[1] - top_margin
    
    # Write the content
    for i, item in enumerate(company_info):
        question = item.get('title', '')
        answer = item.get('context', '')
        
        # Write question number and question
        c.setFont("Helvetica-Bold", font_size)
        c.drawString(left_margin, y_position, f'{i + 1}. {question}')
        y_position -= font_size + 10  # Adjust for line spacing
        
        # Write answer
        c.setFont("Helvetica", font_size)
        c.drawString(left_margin, y_position, answer)
        y_position -= 2 * font_size + 30  # Adjust for space between question and answer
        
        # Check if y position is below the bottom margin
        if y_position < bottom_margin:
            c.showPage()
            c.drawImage(image_path, 0, 0, width=A1[0], height=A1[1])
            y_position = A1[1] - top_margin
            c.setFont("Helvetica-Bold", font_size)  # Reapply font settings for new page

    # Save the PDF
    c.save()


bg = 'bg_temp.jpg'  # Path to your background image
output_pdf_path = 'report_harleys.pdf'  # Path where the PDF will be saved

create_report(company_info, bg, output_pdf_path)

I haven't yet been able to achieve the following:

  • Justify the text (both the title and the context) so that it does not extend beyond the right margin of the page.
  • Ensure that new lines specified by \n and bold text indicated by ** in my JSON are correctly reflected in the PDF.

Here is what I'm currently getting:

[image: screenshot of the current output]

I would like it to look something like this:

[image: screenshot of the desired output]

Notice how each \n produces a new line and the text between ** markers is bold. Compare this with the first item of company_info.

What might I be missing? I've tried numerous tutorials but haven't achieved the desired outcome.


Solution

  • I tried my best, with a lot of help from ChatGPT.

    I tried to make each **text** span bold, but I could not find a way to switch fonts in the middle of a line, so the code ChatGPT produced starts a new line for each bold span:

    
    from reportlab.lib.pagesizes import A1
    from reportlab.pdfgen import canvas
    from reportlab.lib.units import cm
    
    
    
    def draw_wrapped_text(c, text, x, y, max_width):
        """Draw text, rendering **...** spans in bold; each span starts on its own line."""
        start_pos = 0
    
        while start_pos < len(text):
            # Find the next bold markers
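            bold_start = text.find('**', start_pos)
            bold_end = text.find('**', bold_start + 2) if bold_start != -1 else -1
            # Require a matching closing marker; an unmatched ** falls through to the else branch
            if bold_end != -1: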
    
                if bold_start > start_pos:
                    # Draw text before the bold marker
                    normal_text = text[start_pos:bold_start].replace('\n', ' ')
                    y = draw_text_segment(c, normal_text, x, y, max_width, is_bold=False)
    
                # Draw the bold text
                bold_text = text[bold_start + 2:bold_end].replace('\n', ' ')
                y = draw_text_segment(c, bold_text, x, y, max_width, is_bold=True)
    
                # Update the position after the bold text
                start_pos = bold_end + 2
            else:
                # Draw the remaining text as normal
                normal_text = text[start_pos:].replace('\n', ' ')
                y = draw_text_segment(c, normal_text, x, y, max_width, is_bold=False)
                break
    
        return y  # Return the updated y position after the text is drawn
    
    
    def draw_text_segment(c, text, x, y, max_width, is_bold):
        lines = []
        current_line = ""
        
        font = "Helvetica-Bold" if is_bold else "Helvetica"
        c.setFont(font, 12)
    
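        for word in text.split():
            candidate = f"{current_line} {word}".strip()
            if c.stringWidth(candidate, font, 12) <= max_width:
                current_line = candidate
            else:
                # Flush the full line (if any) and start a new one with this word
                if current_line:
                    lines.append(current_line)
                current_line = word

        # Skip the trailing line when the segment was empty or whitespace-only
        if current_line:
            lines.append(current_line)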
    
        for line in lines:
            c.drawString(x, y, line)
            y -= 1.2 * cm
    
        return y
    
    # Create the canvas object
    c = canvas.Canvas("company_report.pdf", pagesize=A1)
    width, height = A1
    
    company_info = [
        {'title': "What were Harley's total sales for Q3?", 
         'context': "Harley's total sales for Q3 were \n **$15 million**, representing a **10% increase** compared to Q2."},
        
        {'title': "Which region showed the highest sales?", 
         'context': "The **North American region** showed the highest sales, contributing **$8 million** to the total."},
        
        {'title': "What was the percentage increase in sales for the European market?", 
         'context': "The **European market** experienced a \n **12% increase** in sales, totaling **$4 million** for Q3."},
        
        {'title': "Did Harley's introduce any new products in Q3?", 
         'context': ("In Q3, Harley's made a significant impact with the introduction of **two new products** that have been well-received by the market. "
                     "The **Harley Davidson X1** is a cutting-edge motorcycle designed with advanced technology and performance enhancements, catering to the evolving needs of enthusiasts. "
                     "Alongside, the **Harley Davidson Pro Series** offers a range of high-performance accessories aimed at enhancing the riding experience. "
                     "These new products have been introduced in response to customer feedback and market trends, reflecting Harley's commitment to innovation and quality. "
                     "The product launches have been supported by comprehensive marketing efforts, including promotional events and digital campaigns, which have effectively generated excitement and increased consumer interest.")},
        
        {'title': "What was the impact of the new product launches on sales?", 
         'context': ("The recent product launches had a substantial impact on Harley's sales performance for Q3. The introduction of the **Harley Davidson X1** and the **Harley Davidson Pro Series** "
                     "contributed an additional **$2 million** in revenue, accounting for approximately **13%** of the total Q3 sales. "
                     "These new products not only boosted overall sales but also enhanced Harley's market position by attracting new customers and increasing repeat purchases. "
                     "The successful integration of these products into the company's existing lineup demonstrates Harley's ability to innovate and adapt to market demands. "
                     "Ongoing customer feedback and sales data will continue to inform product development and marketing strategies, ensuring that Harley's maintains its competitive edge and meets consumer expectations.")}
    ]
    
    # Start writing the content
    y = height - 2 * cm  # Initial y position
    
    for item in company_info:
        title = item['title']
        context = item['context']
        
        # Draw the title in bold
        c.setFont("Helvetica-Bold", 16)
        c.drawString(2 * cm, y, title)
        y -= 1.5 * cm  # Space between title and context
        
        # Draw the context with wrapped text
        y = draw_wrapped_text(c, context, 2 * cm, y, width - 4 * cm)
        y -= 2 * cm  # Space between entries
    
    # Save the PDF
    c.save()
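
For what it's worth, ReportLab itself can cover all three requirements (wrapping at the margins, justification, and inline bold) through its Platypus Paragraph class, which accepts a small markup with <b> and <br/> tags. The following is only a sketch: to_paragraph_markup and draw_paragraph are hypothetical helper names, the 1.3 leading factor is an arbitrary choice, and the ** markers are assumed to be balanced and not to span a newline.

```python
import re
from xml.sax.saxutils import escape


def to_paragraph_markup(text):
    """Convert **bold** spans and newlines into ReportLab paragraph markup."""
    # Escape &, < and > first so literal characters are not read as tags.
    markup = escape(text)
    # Non-greedy match so each **...** pair becomes its own <b>...</b> run.
    markup = re.sub(r"\*\*(.+?)\*\*", r"<b>\1</b>", markup)
    # Treat a newline (plus surrounding spaces) as a forced line break.
    return re.sub(r"[ \t]*\n[ \t]*", "<br/>", markup)


def draw_paragraph(c, markup, x, y_top, max_width, font_size=18):
    """Draw justified, wrapped markup with its top edge at y_top; return the new y."""
    # Imported here so the converter above can be used without ReportLab installed.
    from reportlab.lib.enums import TA_JUSTIFY
    from reportlab.lib.styles import ParagraphStyle
    from reportlab.platypus import Paragraph

    style = ParagraphStyle(
        "body",
        fontName="Helvetica",
        fontSize=font_size,
        leading=font_size * 1.3,
        alignment=TA_JUSTIFY,
    )
    para = Paragraph(markup, style)
    _, height = para.wrap(max_width, y_top)  # lay out the lines for this width
    para.drawOn(c, x, y_top - height)        # drawOn expects the bottom-left corner
    return y_top - height
```

In the question's loop, the answer could then be drawn with y_position = draw_paragraph(c, to_paragraph_markup(answer), left_margin, y_position, A1[0] - left_margin - right_margin), followed by the usual spacing and page-break check. The title can go through the same helper if its text is wrapped in <b> tags.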
    
    

    I would suggest a different library such as fpdf2. I find it easier to work with than ReportLab, unless you need features that only ReportLab provides, and I don't see any such need here.

    Install it with:

    pip install fpdf2

    Here is the code using fpdf2. I have added plenty of comments to explain each step:

    from fpdf import FPDF
    
    
    
    
    company_info = [
        {'title': "What were Harley's total sales for Q3?", 
         'context': "Harley's total sales for Q3 were \n **$15 million**, representing a **10% increase** compared to Q2."},
        
        {'title': "Which region showed the highest sales?", 
         'context': "The **North American region** showed the highest sales, contributing **$8 million** to the total."},
        
        {'title': "What was the percentage increase in sales for the European market?", 
         'context': "The **European market** experienced a \n **12% increase** in sales, totaling **$4 million** for Q3."},
        
        {'title': "Did Harley's introduce any new products in Q3?", 
         'context': ("In Q3, Harley's made a significant impact with the introduction of **two new products** that have been well-received by the market. "
                     "The **Harley Davidson X1** is a cutting-edge motorcycle designed with advanced technology and performance enhancements, catering to the evolving needs of enthusiasts. "
                     "Alongside, the **Harley Davidson Pro Series** offers a range of high-performance accessories aimed at enhancing the riding experience. "
                     "These new products have been introduced in response to customer feedback and market trends, reflecting Harley's commitment to innovation and quality. "
                     "The product launches have been supported by comprehensive marketing efforts, including promotional events and digital campaigns, which have effectively generated excitement and increased consumer interest.")},
        
        {'title': "What was the impact of the new product launches on sales?", 
         'context': ("The recent product launches had a substantial impact on Harley's sales performance for Q3. The introduction of the **Harley Davidson X1** and the **Harley Davidson Pro Series** "
                     "contributed an additional **$2 million** in revenue, accounting for approximately **13%** of the total Q3 sales. "
                     "These new products not only boosted overall sales but also enhanced Harley's market position by attracting new customers and increasing repeat purchases. "
                     "The successful integration of these products into the company's existing lineup demonstrates Harley's ability to innovate and adapt to market demands. "
                     "Ongoing customer feedback and sales data will continue to inform product development and marketing strategies, ensuring that Harley's maintains its competitive edge and meets consumer expectations.")}
    ]
    
    
    
    
    
    ################################################################
    #
    # A \n in the text is treated as a line break, so putting it
    # in the text starts a new line automatically.
    #
    ################################################################
    
    
    
    
    # Create the PDF object
    pdf = FPDF()
    # The constructor accepts several options, for example:
    # orientation (str): possible values are "portrait" (can be abbreviated "P") or "landscape" (can be abbreviated "L"). Defaults to "portrait".
    # unit (str, int, float): possible values are "pt", "mm", "cm", "in", or a number. A point equals 1/72 of an inch, that is to say about 0.35 mm (an inch being 2.54 cm). This is a very common unit in typography; font sizes are expressed in this unit. If given a number, it is treated as the number of points per unit (e.g. 72 = 1 in). Defaults to "mm".
    # format (str): possible values are "a3", "a4", "a5", "letter", "legal" or a tuple (width, height) expressed in the given unit. Defaults to "a4".
    # I am not setting any of them here, to keep everything simple and show how comfortable fpdf2 is to use.
    
    
    pdf.add_page()
    # pdf.set_font(family="Times", style="B", size=15) # bold
    # pdf.set_font(family="Times", style="", size=10) # normal
    # You can also download fonts from Google Fonts and register them with add_font, but I will use the built-in Times font.
    
    
    for i in company_info:
        title = i.get("title")
        context = i.get("context")
    
        # Choose the font before writing any text.
        pdf.set_font(family="Times", style="B", size=12)
        pdf.write(text=title)  # writes the text, wrapping automatically and honoring \n
        pdf.ln()  # starts a new line
        pdf.ln()
    
        # The context should be in the normal style, so switch the font back
        pdf.set_font(family="Times", style="", size=12)
        pdf.write(text=context)
        pdf.ln()
        pdf.ln()
        pdf.ln()
        pdf.ln()
    
    
    
    
    # fpdf2 can also handle more complex layouts like the one you want, but that takes some work. For example, look at this final line:
    # part of it is written in bold and part of it in the normal style.
    pdf.set_font(family="Times", style="", size=12)
    pdf.write(text="first ")
    pdf.write(text="second ")
    pdf.set_font(family="Times", style="B", size=12)
    pdf.write(text="third as bold ")
    pdf.set_font(family="Times", style="", size=12)
    pdf.write(text="fourth as ")
    pdf.set_font(family="Times", style="B", size=12)
    pdf.write(text="normal, but normal word is bold, hhhh.")
    
    
    
    
    # Now save the file
    pdf.output("R_Student.pdf")
    # If you do not pass a name to output(), it returns the PDF as bytes instead.
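
The manual font switching in the last example can be wrapped in a small helper so that the ** markers in the data drive it. This is a sketch rather than anything built into fpdf2: split_bold_runs and write_rich are hypothetical names, the markers are assumed to be balanced, and the only pdf methods used are set_font, write and ln, called the same way as in the code above.

```python
import re


def split_bold_runs(text):
    """Split text on ** markers into (segment, is_bold) pairs."""
    runs = []
    # Splitting on a literal ** alternates normal, bold, normal, ... segments.
    for index, segment in enumerate(re.split(r"\*\*", text)):
        if segment:
            runs.append((segment, index % 2 == 1))
    return runs


def write_rich(pdf, text, family="Times", size=12):
    """Write text through pdf.write(), switching to bold inside ** markers."""
    for segment, is_bold in split_bold_runs(text):
        pdf.set_font(family=family, style="B" if is_bold else "", size=size)
        pdf.write(text=segment)
    pdf.ln()  # finish the paragraph on its own line
```

With this in place, the loop could call write_rich(pdf, context) instead of pdf.write(text=context). Because consecutive write() calls continue on the same line, the bold spans stay inline rather than each starting a new line.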
    

    Thanks.