Search code examples
pythonexcelpandasdataframeexport-to-excel

How to export pandas dataframe with Multi-index columns to Excel with column name in one level unmerged and column name in another level merged?


I have a pandas dataframe df which looks as follows:

Germany
Population  GDP GDP GDP CO2
2015    2020    2015    2020    2010    2020    2030
0   26572   28985   25367   32194   44835   14415   45785
1   12372   41730   35112   37214   40748   16088   46963
2   26480   46811   37487   30372   48703   37997   43135

The columns are Multi-Index consisting of 3 levels. First level has Germany as country. Second level has some indicators, and third level has years. And there are some data in the pandas dataframe.

I'd like to export this dataframe to Excel such that I get Germany in each column. Then I'd like to have Population, GDP and CO2 merged for the second level. It should look something as shown:enter image description here

When I used df.to_excel(file, merge_cells = True), I get something as shown:enter image description here But I don't want to merge column for Germany, and want to have it split as shown in the screen shot on top. It also does not work using merge_cells = False as an argument because then the column names in different levels are concatenated as a column name in one row.

What would be the suitable approach to export it accordingly? (If possible, can the empty row between year and values in the exported file also be removed?)

Note: df.to_dict() looks like this:

{('Germany', 'Population', 2015): {0: 26572, 1: 12372, 2: 26480},
 ('Germany', 'Population', 2020): {0: 28985, 1: 41730, 2: 46811},
 ('Germany', 'GDP', 2015): {0: 25367, 1: 35112, 2: 37487},
 ('Germany', 'GDP', 2020): {0: 32194, 1: 37214, 2: 30372},
 ('Germany', 'GDP', 2010): {0: 44835, 1: 40748, 2: 48703},
 ('Germany', 'CO2', 2020): {0: 14415, 1: 16088, 2: 37997},
 ('Germany', 'CO2', 2030): {0: 45785, 1: 46963, 2: 43135}}

Solution

  • One trick is write first level of MultiIndex separately and then skip first row for write another values - DataFrame without first level:

    writer = pd.ExcelWriter('data.xlsx')
    
    df1 = pd.DataFrame(columns=df.droplevel([1,2], axis=1).columns)
    df2 = df.droplevel(0, axis=1)
    
    df1.to_excel(writer, sheet_name='Sheet1')
    df2.to_excel(writer, sheet_name='Sheet1', merge_cells = True, startrow=1)
    
    writer.close()