Search code examples
pythonpandasmulti-index

Adding a multindex to the columns of a pandas df based on several dictionaries


Assuming I have a DF like:

person_names = ['mike','manu','ana','analia','anomalia','fer']
df = pd.DataFrame(np.random.randn(5, 6), columns = person_names)
df

I also have two dictionaries, for easy purposes assuming only two:

# giving a couple of dictionaries like:
d = {'mike':{'city':'paris', 'department':2},
   'manu':{'city':'london', 'department':1},
   'ana':{'city':'barcelona', 'department':5}}
d2 = {'analia':{'functional':True, 'speed':'high'},
      'anomalia':{'functional':True, 'speed':'medium'},
      'fer':{'functional':False, 'speed':'low'}}

The result I would to achieve is a df having a multindex as shown in the excel screenshot here:

enter image description here

The dictionaries contain values for SOME of the column names.

Not only I need to create the multiindex based on dictionaries but take into account that the keys of the dictionaries are different for both dictionaries, I would also like to keep the original names of the columns as first level of the multiindex.

Any suggestion?


Solution

  • Here's a way to do what your question asks for an arbitrary number of dictionaries, with the levels of the MultiIndex in the sequence shown in the OP:

    dcts = (d, d2)
    out = df.T
    all_keys = []
    for dct in reversed(dcts):
        keys = list(reversed(dct[next(iter(dct))].keys()))
        all_keys += keys
        out[keys] = [[dct[name][key] if name in dct else None for key in keys] for name in out.index]
    out = out.reset_index().rename(columns={'index':'person name'}).set_index(all_keys).T
    

    Output:

    speed             NaN                          high    medium       low
    functional        NaN                          True      True     False
    department        2.0       1.0       5.0       NaN       NaN       NaN
    city            paris    london barcelona       NaN       NaN       NaN
    person name      mike      manu       ana    analia  anomalia       fer
    0             -0.8392 -0.531491  0.191687  1.147941  0.292531    0.4075
    1            0.644307  0.550632 -0.241302   2.20928  -0.91035 -0.514395
    2            0.305059  0.123411  0.846471 -0.151596 -0.410483 -1.661958
    3           -0.763649  0.109589  2.215242  0.530734  0.261605  0.472373
    4            0.153886 -0.782931  1.179735 -0.339259  0.314743 -0.088172
    

    Note that because of MultiIndex grouping, some values of the levels speed and functional are shown just once and not repeated for columns to the right with the same level value.