Search code examples
pandasdataframenumber-formattingpython-docx

How do I add formatted values from a Style Object into a Word document table using the docx module?


I am adding tables in a Word document by using data from a dataframe using pandas and the Python .docx module. I want the data values to appear in the Word document table with the format style I applied to the dataframe. Some columns have a Number format with comma separator {:,}, and some columns have a percentage format {:.2%}.

However, after I add the format style to the dataframe, the dataframe becomes a Style object. I can not then add values from a Style object to the table in Word.

How do I apply format styles to the values in the dataframe, so that they appear styled in the Word document table?

import pandas as pd
import docx
import openpyxl 
  
# initialize list of lists 
data = [[150000, 100000,.14565], [250000, 200000,.16334]]
  
# Create the pandas DataFrame 
df = pd.DataFrame(data, columns = ['Revenues', 'Expenditures', 'Surplus']) 

# Apply style to pandas DataFrame 
df = df.style.format({"Revenues": "${:20,.0f}","Expenditures": "${:20,.0f}","Surplus": "{:.2%}"})

# Create the Word Document
doc = docx.Document('hello.docx')

# add a table to the end and create a reference variable
# extra row is so we can add the header row
t = doc.add_table(df.shape[0]+1, df.shape[1])

# add the header rows.
for j in range(df.shape[-1]):
    t.cell(0,j).text = df.columns[j]
    
# add the rest of the data frame
for i in range(df.shape[0]):
    for j in range(df.shape[-1]):
        t.cell(i+1,j).text = str(df.values[i,j])
        
doc.save('hello_python_output.docx')



Solution

  • Instead of using a style object (which is primarily used for rendering a dataframe in HTML). You can apply these conversions directly to the dataframe (thus making each column a string or object dtype) and write those string values to your word document. You can apply your formatting via the transform method:

    conversions = {
        "Revenues": "${:20,.0f}",
        "Expenditures": "${:20,.0f}",
        "Surplus": "{:.2%}"
    }
    
    new_df = df.transform({k: v.format for k, v in conversions.items()})
    
    print(new_df)
                    Revenues           Expenditures Surplus
    0  $             150,000  $             100,000  14.56%
    1  $             250,000  $             200,000  16.33%