How to write a grouped (groupby) table from a pandas DataFrame to a Word document?


I am new to pandas and python-docx. I have a table like the one below, built with groupby in pandas.

Output of print(df):

Change Type      typeA        typeB     typeC    typeD    typeE    typeF
Component
A                  0            2        6         0         0       6
B                  0            3        2         1         1       3
C                  0            1        0         0         0       4
D                  0            2        2         0         0       3
E                  0            0        0         0         0       1
F                  0            3        0         0         1       2
G                  2            1        3         0         2       3
H                  0            0        0         0         0       1
I                  0            1        0         0         0       0

I am using the following to write this DataFrame df to a Word document:

t = doc.add_table(df.shape[0]+1, df.shape[1])

for j in range(df.shape[-1]):
    t.cell(0,j).text = df.columns[j]


for i in range(df.shape[0]):
    for j in range(df.shape[-1]):
        t.cell(i+1,j).text = str(df.values[i,j])

as suggested in the answer to: Writing a Python Pandas DataFrame to Word document

This is what ends up in the document:

typeA        typeB     typeC    typeD    typeE    typeF
  0            2        6         0         0       6
  0            3        2         1         1       3
  0            1        0         0         0       4
  0            2        2         0         0       3
  0            0        0         0         0       1
  0            3        0         0         1       2
  2            1        3         0         2       3
  0            0        0         0         0       1
  0            1        0         0         0       0

I am a newbie and cannot figure out where I am going wrong. The Component column and the axis names are missing. How can I write the whole table?


Solution

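  • It looks like you are not writing the index values of the DataFrame to the file. df.columns and df.values only hold the column labels and the data; the row labels (A to I) live in df.index. I have amended your script to show how you can use df.index to access them and write them to the first column.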

    #...
    
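    # two header rows (column names, index name) plus one extra column for the row labels
    t = doc.add_table(df.shape[0]+2, df.shape[1]+1)
    
    # write the column labels to row 0, shifted right by one column
    for j in range(df.shape[1]):
        t.cell(0,j+1).text = str(df.columns[j])
    
    # write the data, starting at row 2 and column 1
    for i in range(df.shape[0]):
        for j in range(df.shape[1]):
            t.cell(i+2,j+1).text = str(df.values[i,j])
    
    # write the name of the columns axis to file; empty string if it is unset
    t.cell(0,0).text = str(df.columns.name or "")
    
    # write the name of the index to file; empty string if it is unset
    t.cell(1,0).text = str(df.index.name or "")
    
    # write the index values to the first column; str() covers non-string labels
    for i, my_index in enumerate(df.index):
        t.cell(i+2,0).text = str(my_index)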
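
If you need this more than once, the same idea can be wrapped in two small helpers. This is only a sketch: frame_to_grid, write_grid and demo are names I made up for illustration, not part of pandas or python-docx, and it assumes a plain single-level index and columns (no MultiIndex). The first helper turns the DataFrame into rows of strings laid out like print(df); the second copies those rows into a table.

```python
def frame_to_grid(df):
    """Return df as rows of strings, including index labels and both axis names."""
    columns = [str(c) for c in df.columns]
    # Row 0: name of the columns axis, then the column labels.
    header = [str(df.columns.name or "")] + columns
    # Row 1: name of the index, then empty cells (same layout as print(df)).
    index_row = [str(df.index.name or "")] + [""] * len(columns)
    # Remaining rows: one index label followed by that row's values.
    body = [
        [str(label)] + [str(value) for value in row]
        for label, row in zip(df.index, df.values)
    ]
    return [header, index_row] + body


def write_grid(doc, grid):
    """Add a table to a python-docx Document and fill it cell by cell."""
    table = doc.add_table(rows=len(grid), cols=len(grid[0]))
    for i, row in enumerate(grid):
        for j, text in enumerate(row):
            table.cell(i, j).text = text
    return table


def demo(path="table.docx"):
    """Hypothetical usage; needs pandas and python-docx installed."""
    # Imported here so the helpers above can be used without these packages.
    import pandas as pd
    from docx import Document

    df = pd.DataFrame({"typeA": [0, 0], "typeB": [2, 3]}, index=["A", "B"])
    df.index.name = "Component"
    df.columns.name = "Change Type"

    doc = Document()
    write_grid(doc, frame_to_grid(df))
    doc.save(path)
```

Calling demo() should produce a document whose table has the Component labels in the first column. Building the grid first also makes it easy to check the layout with a plain print before anything is written to the file.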