Tags: python, pandas, multi-index

How to build a MultiIndex DataFrame from a dict of data and a dict of index levels


I'm struggling to create this DataFrame:

     A  B
x y      
a 1  2  1
  2  6  3
c 2  7  2

from these two dictionaries, which seem like they should be sufficient:

data = {'A': [2,6,7],
        'B': [1,3,2]}

index = {'x': ['a', 'a', 'c'],
         'y': [1, 2, 2]}

I tried several methods without success, including:

dfm = pd.DataFrame.from_records(data=data, index=index)

This fails because the resulting index has length 2 instead of 3. I know of an alternative that splits the index dict into a list of arrays and a list of names:

labels = [['a', 'a', 'c'], [1, 2, 2]]
mi = pd.MultiIndex.from_arrays(labels, names=['x','y'])
dfm = pd.DataFrame(data, index=mi)

but this isn't very elegant.


Solution

  • I think the cleanest way is to turn index into a DataFrame and then convert that into a MultiIndex:

    mi = pd.MultiIndex.from_frame(pd.DataFrame(index))
    dfm = pd.DataFrame(data, index=mi)
    

    Result:

         A  B
    x y      
    a 1  2  1
      2  6  3
    c 2  7  2
    

    However, it might be more memory-efficient to "unpack" the dict directly, which avoids the intermediate DataFrame:

    mi = pd.MultiIndex.from_arrays(list(index.values()), names=list(index.keys()))
    dfm = pd.DataFrame(data, index=mi)
    

    (same result)
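
For anyone who wants to check this end to end, here is a minimal self-contained sketch that builds the index both ways and compares them. The variable names mi_frame and mi_arrays are only for illustration, and it assumes a pandas version that has MultiIndex.from_frame (0.24 or later) and Python 3.7+ so that the dict keeps its insertion order.

```python
import pandas as pd

data = {'A': [2, 6, 7],
        'B': [1, 3, 2]}
index = {'x': ['a', 'a', 'c'],
         'y': [1, 2, 2]}

# Option 1: go through an intermediate DataFrame; the column names
# become the level names.
mi_frame = pd.MultiIndex.from_frame(pd.DataFrame(index))

# Option 2: unpack the dict directly. values() and keys() iterate in the
# same (insertion) order, so each array lines up with its level name.
mi_arrays = pd.MultiIndex.from_arrays(list(index.values()),
                                      names=list(index.keys()))

dfm = pd.DataFrame(data, index=mi_arrays)

# Both options should describe the same index.
assert mi_frame.equals(mi_arrays)
assert list(dfm.index.names) == ['x', 'y']

print(dfm)
```

Either index can be passed to pd.DataFrame(data, index=...); which one reads better is mostly a matter of taste.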