Search code examples
pythonpandasdataframemulti-index

Pandas add value to inner level of hierarchical index


I have a Pandas DataFrame with a hierarchical index (MultiIndex). I created this DataFrame by grouping values for "cousub" and "year".

annualMed = df.groupby(["cousub", "year"])[["ratio", "sr_val_transfer"]].median().round(2)
print annualMed.head(8)    

                      ratio  sr_val_transfer
cousub          year                        
Allen Park city 2013   0.51          75000.0
                2014   0.47          85950.0
                2015   0.47          95030.0
                2016   0.45         102500.0
Belleville city 2013   0.49         113900.0
                2014   0.55         114750.0
                2015   0.53         149000.0
                2016   0.48         121500.0    

I would like to add an "overall" value in the "year" level that I could then populate with values based on a grouping of "cousub" alone, i.e., excluding "year". I would like the result to look like the following

                      ratio  sr_val_transfer
cousub          year                        
Allen Park city 2013   0.51          75000.0
                2014   0.47          85950.0
                2015   0.47          95030.0
                2016   0.45         102500.0
             Overall   0.50          90000.0
Belleville city 2013   0.49         113900.0
                2014   0.55         114750.0
                2015   0.53         149000.0
                2016   0.48         121500.0 
             Overall   0.50         135000.0

How can I add this new item to the "years" level of the MultiIndex?


Solution

  • If you want to just add these two columns explicitly, you could just specify all the MultiIndex levels with loc.

    df.loc[('Allen Park city', 'Overall'), :] = (0.50, 90000.)
    df.loc[('Belleville city', 'Overall'), :] = (0.50, 135000.)
    

    If you had a whole list of cities that you wanted to add this row for however, this would be a bit tedious. Maybe you could append another DataFrame with the overall values with a bit of index manipulation.

    (df.reset_index()
       .append(pd.DataFrame([['Allen Park city', 'Overall', 0.5, 90000.], 
                             ['Belleville city', 'Overall', 0.5, 135000.]], 
                             columns=list(df.index.names) + list(df.columns)))
       .set_index(df.index.names)
       .sort_index())
    

    Demo

    Method 1 (smaller case)

    >>> df.loc[('Allen Park city', 'Overall'), :] = (0.50, 90000.)
    
    >>> df.loc[('Belleville city', 'Overall'), :] = (0.50, 135000.)
    
    >>> df.sort_index()
    
                             ratio  sr_val_transfer
    cousub          year                           
    Allen Park city 2013      0.51          75000.0
                    2014      0.47          85950.0
                    2015      0.47          95030.0
                    2016      0.45         102500.0
                    Overall   0.50          90000.0
    Belleville city 2013      0.49         113900.0
                    2014      0.55         114750.0
                    2015      0.53         149000.0
                    2016      0.48         121500.0
                    Overall   0.50         135000.0
    

    Method 2 (larger case)

    >>> (df.reset_index()
           .append(pd.DataFrame([['Allen Park city', 'Overall', 0.5, 90000.], 
                                 ['Belleville city', 'Overall', 0.5, 135000.]], 
                                 columns=list(df.index.names) + list(df.columns)))
           .set_index(df.index.names)
           .sort_index())
    
                             ratio  sr_val_transfer
    cousub          year                           
    Allen Park city 2013      0.51          75000.0
                    2014      0.47          85950.0
                    2015      0.47          95030.0
                    2016      0.45         102500.0
                    Overall   0.50          90000.0
    Belleville city 2013      0.49         113900.0
                    2014      0.55         114750.0
                    2015      0.53         149000.0
                    2016      0.48         121500.0
                    Overall   0.50         135000.0