Tags: python, pandas, dictionary, dataframe, multi-index

What's the best way to create a Pandas MultiIndex from a list of dictionaries?


I have an iterative process that runs with different parameter values on each iteration. I want to collect the parameter values and results in a Pandas dataframe whose multi-index is built from the sets of parameter values (which are unique).

On each iteration, the parameter values are in a dictionary like this:

params = {'p': 2, 'q': 7}

So it is easy to collect them in a list along with the results:

results_index = [
    {'p': 2, 'q': 7},
    {'p': 2, 'q': 5},
    {'p': 1, 'q': 4},
    {'p': 2, 'q': 4}
]
results_data = [
    {'A': 0.18, 'B': 0.18},
    {'A': 0.67, 'B': 0.21},
    {'A': 0.96, 'B': 0.45},
    {'A': 0.58, 'B': 0.66}
]

But I can't find an easy way to produce the desired multi-index from results_index.

I tried this:

df = pd.DataFrame(results_data, index=results_index)

But it produces this:

                     A     B
{'p': 2, 'q': 7}  0.18  0.18
{'p': 2, 'q': 5}  0.67  0.21
{'p': 1, 'q': 4}  0.96  0.45
{'p': 2, 'q': 4}  0.58  0.66

(The index was not converted into a MultiIndex.)

What I want is this:

        A     B
p q            
2 7  0.18  0.18
  5  0.67  0.21
1 4  0.96  0.45
2 4  0.58  0.66

This works, but there must be an easier way:

df = pd.concat([pd.DataFrame(results_index), pd.DataFrame(results_data)], axis=1).set_index(['p', 'q'])
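A variant of the same idea, sketched here as one possibility: merge each parameter dict with its result dict into a single record, then promote the parameter columns to the index. This assumes the parameter names and result names never collide (a shared key would be silently overwritten by the result value). The `records` name is only for illustration.

```python
import pandas as pd

results_index = [
    {'p': 2, 'q': 7},
    {'p': 2, 'q': 5},
    {'p': 1, 'q': 4},
    {'p': 2, 'q': 4},
]
results_data = [
    {'A': 0.18, 'B': 0.18},
    {'A': 0.67, 'B': 0.21},
    {'A': 0.96, 'B': 0.45},
    {'A': 0.58, 'B': 0.66},
]

# One flat record per iteration: parameters first, then results.
# Assumption: parameter keys and result keys do not overlap.
records = [{**params, **result}
           for params, result in zip(results_index, results_data)]

# Move the parameter columns into the index, which yields a MultiIndex.
df = pd.DataFrame(records).set_index(['p', 'q'])
print(df)
```

In a real loop the merged record could be appended on each iteration, so only one list has to be maintained.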

UPDATE:

This also works, but it makes me nervous: how can I be sure the parameter values are aligned with the level names?

index = pd.MultiIndex.from_tuples([tuple(i.values()) for i in results_index], 
                                  names=results_index[0].keys())
df = pd.DataFrame(results_data, index=index)

        A     B
p q            
2 7  0.18  0.18
  5  0.67  0.21
1 4  0.96  0.45
2 4  0.58  0.66
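One way to remove the alignment worry, sketched under the assumption that every dict contains the same keys: fix the level order once in a list and look each value up by name, rather than relying on `values()` and `keys()` iterating in the same order for every dict. The `names` list below is introduced only for this example.

```python
import pandas as pd

results_index = [
    {'p': 2, 'q': 7},
    {'q': 5, 'p': 2},  # keys deliberately in a different order
    {'p': 1, 'q': 4},
    {'p': 2, 'q': 4},
]
results_data = [
    {'A': 0.18, 'B': 0.18},
    {'A': 0.67, 'B': 0.21},
    {'A': 0.96, 'B': 0.45},
    {'A': 0.58, 'B': 0.66},
]

# Explicit level order (an assumption for this sketch).
names = ['p', 'q']

# Look up each value by key, so dict ordering no longer matters.
# A missing key raises KeyError instead of silently misaligning.
index = pd.MultiIndex.from_tuples(
    [tuple(d[name] for name in names) for d in results_index],
    names=names,
)
df = pd.DataFrame(results_data, index=index)
print(df)
```

With this version, a dict whose keys were inserted in a different order still lands in the right levels.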

Solution

  • I ran into this recently, and there seems to be a slightly cleaner way than the accepted answer: build the index with pd.MultiIndex.from_frame (available since pandas 0.24):

    import pandas as pd
    
    results_index = [
        {'p': 2, 'q': 7},
        {'p': 2, 'q': 5},
        {'p': 1, 'q': 4},
        {'p': 2, 'q': 4}
    ]
    
    results_data = [
        {'A': 0.18, 'B': 0.18},
        {'A': 0.67, 'B': 0.21},
        {'A': 0.96, 'B': 0.45},
        {'A': 0.58, 'B': 0.66}
    ]
    
    # Turn the parameter dicts into a frame, then use its columns as index levels
    index = pd.MultiIndex.from_frame(pd.DataFrame(results_index))
    
    df = pd.DataFrame(results_data, index=index)
    print(df)
    

    Outputs:

            A     B
    p q            
    2 7  0.18  0.18
      5  0.67  0.21
    1 4  0.96  0.45
    2 4  0.58  0.66
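As a quick check on the alignment concern raised in the question, here is a minimal sketch showing that this approach matches values to level names by key, because the `pd.DataFrame` constructor aligns a list of dicts on their keys. The reordered second dict and the `q4` variable are made up for the demonstration.

```python
import pandas as pd

results_index = [
    {'p': 2, 'q': 7},
    {'q': 5, 'p': 2},  # same parameters, keys in a different order
    {'p': 1, 'q': 4},
    {'p': 2, 'q': 4},
]
results_data = [
    {'A': 0.18, 'B': 0.18},
    {'A': 0.67, 'B': 0.21},
    {'A': 0.96, 'B': 0.45},
    {'A': 0.58, 'B': 0.66},
]

# The DataFrame constructor aligns dict values by key, so each value
# ends up under the matching level name.
index = pd.MultiIndex.from_frame(pd.DataFrame(results_index))
df = pd.DataFrame(results_data, index=index)

# Select every row where level 'q' equals 4; the 'q' level is dropped
# from the result, leaving 'p' as the index.
q4 = df.xs(4, level='q')
print(q4)
```

Selecting on a named level with `xs` is one way to confirm that the levels carry the values you expect.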