Search code examples
pythonexcelrowsmulti-indexpandas.excelwriter

Summary Row for a pd.DataFrame with multiindex


I have a multiIndex dataframe created with pandas similar to this one:

Dataframe

nest = {'A1': dfx[['aa','bb','cc']],
        'B1':dfx[['dd']],
        'C1':dfx[['ee', 'ff']]}

reform = {(outerKey, innerKey): values for outerKey, innerDict in nest.items() for innerKey, values in innerDict.items()}

dfzx = pd.DataFrame(reform)

MultiIndex DF

What I am trying to achieve is to add a new row at the end of the dataframe that contains a summary of the total for the three categories represented by the new index (A1, B1, C1).

I have tried with df.loc (what I would normally use in this case) but I get error. Similarly for iloc.

a1sum = dfzx['A1'].sum().to_list()
a1sum = sum(a1sum)

b1sum = dfzx['B1'].sum().to_list()
b1sum = sum(b1sum)

c1sum = dfzx['C1'].sum().to_list()
c1sum = sum(c1sum)

totalcat = a1sum, b1sum, c1sum

newrow = ['Total', totalcat]
newrow

dfzx.loc[len(dfzx)] = newrow

ValueError: cannot set a row with mismatched columns

#Alternatively
newrow2 = ['Total', a1sum, b1sum, c1sum]
newrow2

dfzx.loc[len(dfzx)] = newrow2

ValueError: cannot set a row with mismatched columns

How can I fix the mistake? Or else is there any other function that would allow me to proceed?

Note: the DF is destined to be moved on an Excel file (I use ExcelWriter).

The type of results I want to achieve in the end is this one (gray row "SUM"

dfzx in Excel


Solution

  • I came up with a sort of solution on my own.

    1. I created a separate DataFrame in Pandas that contains the summary.

    2. I used ExcelWriter to have both dataframes on the same excel worksheet.

    Technically It would be then possible to style and format data in Excel (xlsxwriter or framestyle seem to be popular modules to do so). Alternatively one should be doing that manually.

    Solution