Python PDF package for presenting output from plotly charts and maps and pandas DataFrames, with some text comments


Which Python package would you recommend for generating a simple report containing:

  1. Tables from a pandas DataFrame, somewhat prettified
  2. Charts and maps generated by plotly
  3. Some comments on the two above

Google tells me it is either ReportLab or pyfpdf. ReportLab looks like massive overkill for my needs. Am I wrong? What is the simplest package to learn that I can get away with?

Thank you

So far I have tried Googling and searching Stack Overflow, Udemy, and YouTube.


Solution

  • pyfpdf is a very easy-to-use library, and I'm sure it can satisfy your needs.

    Versions of pyfpdf [fpdf2]

    Until now I have always used the old version of pyfpdf in my applications (see here for documentation). That version is very old, so for new projects I suggest using the new version, called fpdf2.
    You can find it at this link on GitHub, which hosts both the source code and the documentation.

    1. Create tables from a pandas DataFrame with pyfpdf

    I always use pyfpdf to create PDF files that organize data in tables. Here are some links with example code for adding a table to a PDF with pyfpdf:

    • this post contains complete code that takes data from a list and creates a table inside a PDF; the code uses the first element of the list to create the table header, which is styled a bit differently from the other rows
    • this post shows how to create a table inside a PDF with the cell() and multi_cell() methods of the FPDF class
    • this post contains code and notes on justifying text inside a multi_cell and on setting the cell height so that all the text fits (long text is split into multiple lines)
    • code that writes data from a pandas DataFrame into a table of a PDF with pyfpdf is shown below:
    from fpdf import FPDF
    import pandas as pd
    
    pdf = FPDF(format='letter', unit='in')
    
    pdf.add_page()
    pdf.set_font('helvetica', '', 8)
    
    pdf.ln(0.25)
    data = [
        [1, 'data(1,1)', 'data(1,2)', 'data(1,3)', 'data(1,4)'],
        [2, 'data(2,1)', 'data(2,2)', 'data(2,3)', 'data(2,4)'],
        [3, 'data(3,1)', 'data(3,2)', 'data(3,3)', 'data(3,4)'],
        [4, 'data(4,1)', 'data(4,2)', 'data(4,3)', 'data(4,4)'],
    ]
    
    df = pd.DataFrame(data)
    
    # write the DataFrame row by row, one fixed-size cell per value
    for _, row in df.iterrows():
        for value in row.values:
            pdf.cell(1.6, 0.5, str(value))  # cell is 1.6 in wide and 0.5 in high
        pdf.ln()  # move to the start of the next line
    
    pdf.output('test.pdf')
    

    It creates a file called test.pdf, and the content of the PDF file is shown in the following picture: [image: content of test.pdf]
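
    The question asks for "somewhat prettified" tables, and the loop above writes plain cells with no header. As a minimal sketch of one way to add a bold header row and cell borders, the hypothetical helper below (write_table is not part of pyfpdf) uses only set_font(), cell() and ln(). It assumes pdf is an FPDF instance that already has a page, and that the unit is inches as in the example above; the default sizes are arbitrary.

```python
def write_table(pdf, header, rows, col_width=1.6, row_height=0.3,
                font_family="helvetica", font_size=8):
    """Write a bordered table with a bold header row (hypothetical helper).

    `pdf` is assumed to be an FPDF instance on which add_page() was called.
    """
    # Header row: bold font, one bordered cell per column title
    pdf.set_font(font_family, "B", font_size)
    for title in header:
        pdf.cell(col_width, row_height, str(title), border=1)
    pdf.ln(row_height)  # go to the start of the next line

    # Body rows: regular font, one bordered cell per value
    pdf.set_font(font_family, "", font_size)
    for row in rows:
        for value in row:
            pdf.cell(col_width, row_height, str(value), border=1)
        pdf.ln(row_height)
```

    With the DataFrame from the example above it could be called as write_table(pdf, df.columns, df.values.tolist()) before pdf.output(). Fixed column widths keep the sketch short; long values would overflow their cells, which is where the multi_cell() posts linked above become relevant.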

    2. Add charts and maps (generated by plotly) to a PDF with pyfpdf

    So far I have only rarely added images to PDF files created with pyfpdf. To do so I used the image() method of the FPDF class, which can add images stored in JPEG, PNG, and GIF files.

    I suggest this link for a simple example of how to combine pyfpdf and plotly. The code shown there also uses the image() method of the FPDF class to add graphs to a PDF file, and it uses the new version, fpdf2.

    The core of the example code for adding an image to a PDF with fpdf is the following:

    import io
    from fpdf import FPDF
    import plotly.graph_objects as go
    
    # Create the figure object 'fig' with plotly
    fig = go.Figure()
    # fig.add.....  # add your traces here
    
    # Convert the figure to png using kaleido
    image_data = fig.to_image(format="png", engine="kaleido")
    
    # Wrap the PNG bytes in an io.BytesIO object, which fpdf2 can read as an image
    image = io.BytesIO(image_data)
    
    pdf = FPDF()
    pdf.add_page()
    # Add the image to the FPDF instance
    pdf.image(image, w=pdf.epw)  # image width = effective page width (page width minus margins)
    pdf.output("plotly_demo.pdf")
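
    The question also asks for text comments next to the tables and charts. A minimal sketch of one way to do that, assuming fpdf2 (for epw) and its default unit of millimetres: the hypothetical helper add_commented_figure below writes a paragraph with multi_cell() and then places the image under it. The helper name and the default sizes are illustrative only.

```python
def add_commented_figure(pdf, image, comment, line_height=5,
                         font_family="helvetica", font_size=10):
    """Write a comment paragraph, then place an image below it (hypothetical helper).

    `pdf` is assumed to be an fpdf2 FPDF instance on which add_page() was called;
    `image` is anything FPDF.image() accepts (a file path or an io.BytesIO object).
    """
    pdf.set_font(font_family, "", font_size)
    # w=0 lets the text run to the right margin; long text wraps onto new lines
    pdf.multi_cell(0, line_height, comment)
    # Pass x explicitly so the image starts at the left margin wherever
    # multi_cell() left the horizontal cursor
    pdf.image(image, x=pdf.l_margin, w=pdf.epw)
```

    With the pdf and image objects from the snippet above, it could be called as add_commented_figure(pdf, image, "Some comment on the chart") in place of the pdf.image(...) line, before pdf.output().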