python, pandas, ms-word, python-3.4, pywin32

Writing a pandas DataFrame to a Word document table via pywin32


I am currently working on a script that needs to write to a .docx file for presentation purposes. I use pandas to handle all the data calculations in the script. I want to write a pandas DataFrame into a table at a bookmark in a Word .docx file using pywin32. The DataFrame consists of floats. The pseudocode is something like this:

frame = DataFrame(np.arange(28).reshape((4,7)), columns=['Text1',...'Text7'])

With pywin32 imported...

wordApp = win32.gencache.EnsureDispatch('Word.Application')
wordApp.Visible = False
doc = wordApp.Documents.Open(os.getcwd()+'\\template.docx')
rng = doc.Bookmarks("PUTTABLEHERE").Range
rng.InsertTable.here

Now I would like to create a table at this bookmark. The dimensions of the table should be dictated by the DataFrame, and the column titles should become the header row of the Word table.


Solution

  • Basically, all you need to do is create a table in Word and fill each cell with the corresponding value from the data frame:

    import os

    import numpy as np
    import win32com.client as win32
    from pandas import DataFrame

    # data frame
    df = DataFrame(np.arange(28).reshape((4,7)), columns=['Text1',...'Text7'])

    wordApp = win32.gencache.EnsureDispatch('Word.Application')
    wordApp.Visible = False
    doc = wordApp.Documents.Open(os.path.join(os.getcwd(), 'template.docx'))
    rng = doc.Bookmarks("PUTTABLEHERE").Range

    # Create the table with one extra row, because the column names
    # go into the first row as the header
    Table = rng.Tables.Add(rng, NumRows=df.shape[0]+1, NumColumns=df.shape[1])

    for col in range(df.shape[1]):
        # Write the column name into the header row
        Table.Cell(1,col+1).Range.Text = str(df.columns[col])
        for row in range(df.shape[0]):
            # Write each value of the data frame below the header
            Table.Cell(row+1+1,col+1).Range.Text = str(df.iloc[row,col])
    

    Notice that two ones are added to the row index in Table.Cell(row+1+1,col+1). Tables in Microsoft Word are indexed from 1, while positions in a pandas data frame start from 0, so both row and col need 1 added.

    The second 1 added to row leaves room for the data frame's column names in the header row. That should do it!
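
The index arithmetic described above can be checked without Word installed. The following is a minimal sketch, not part of the original answer: the helper names `iter_word_cells` and `table_shape` are hypothetical, and the small `columns`/`rows` lists stand in for a real data frame. It only computes the 1-based (row, column, text) triples that would be written into the Word table.

```python
def table_shape(columns, rows):
    """Return (NumRows, NumColumns) for the Word table, header row included."""
    return len(rows) + 1, len(columns)


def iter_word_cells(columns, rows):
    """Yield (word_row, word_col, text) for the header and every data value.

    Word table indices start at 1, so each 0-based Python index is shifted
    by 1. Data rows are shifted by one more to leave row 1 for the header.
    """
    for col, name in enumerate(columns):
        yield 1, col + 1, str(name)
    for row, values in enumerate(rows):
        for col, value in enumerate(values):
            yield row + 2, col + 1, str(value)


# Hypothetical stand-in for a data frame with 2 rows and 3 columns.
columns = ["a", "b", "c"]
rows = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]

cells = list(iter_word_cells(columns, rows))
for word_row, word_col, text in cells:
    print(word_row, word_col, text)
```

With a real data frame, `columns` could be `list(df.columns)` and `rows` could be `df.values.tolist()`, and each triple would then be applied as `Table.Cell(word_row, word_col).Range.Text = text`. Keeping the offsets in one place makes the "+1 for Word, +1 for the header" rule easy to verify before touching the document.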