I realize that similar questions have already been asked here (e.g., Writing a Python Pandas DataFrame to Word document, python-docx: Parse a table to Panda Dataframe). Nevertheless, I believe this question is different.
I have used value_counts() and generated a DataFrame as below:
import pandas as pd
import seaborn as sns

df = sns.load_dataset('tips')
object_cols = list(df.select_dtypes(exclude=['int', 'float', 'int64', 'float64', 'int32', 'float32']).columns)
# Value Count & Percentage for object columns
c = df[object_cols].apply(lambda x: x.value_counts()).T.stack().astype(int)
p = (df[object_cols].apply(lambda x: x.value_counts(normalize=True)).T.stack() * 100).round(2)
cp = pd.concat([c, p], axis=1, keys=['Count', 'Percentage %'])
cp
The DataFrame looks like:
               Count  Percentage %
sex    Female     87         35.66
       Male      157         64.34
smoker No        151         61.89
       Yes        93         38.11
day    Fri        19          7.79
       Sat        87         35.66
       Sun        76         31.15
       Thur       62         25.41
time   Dinner    176         72.13
       Lunch      68         27.87
I'm trying to add the above DataFrame as a table in a document using python-docx:
import docx
from docx import Document

doc = Document()
doc.add_paragraph("Value Counts: ")
t = doc.add_table(cp.shape[0]+1, cp.shape[1])
# Set table style
t.style = 'Colorful List Accent 1'
# add the header rows.
for j in range(cp.shape[-1]):
    t.cell(0,j).text = cp.columns[j]
# add the rest of the data frame
for i in range(cp.shape[0]):
    for j in range(cp.shape[-1]):
        t.cell(i+1,j).text = str(cp.values[i,j])
filename = "output/ValueCOunts_Report.docx"
# save the docx
doc.save(filename)
However, the table that ends up in the document contains only the values; the index labels are missing:
Count  Percentage %
   87         35.66
  157         64.34
  151         61.89
.....
How can I add the complete DataFrame, including its index, as a table to the document?
This is a slightly hacky solution, as it moves the index levels into columns and then manipulates those columns to look like an index.
Reset the index, then use Series.duplicated with np.where to replace the repeated values in the first column with blanks:
import numpy as np

cp = cp.rename_axis(['Attr','Val']).reset_index()
cp['Attr'] = np.where(cp['Attr'].duplicated(),'',cp['Attr'])
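Running your code on the reshaped cp then writes the former index levels into the table as its first two columns, since cp.shape and cp.values now include them.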