How to merge separated dictionaries in order to make a single DataFrame?


My input is a Python list:

l = [
    {'name': 'foo', 'data': [{'id': 1}, {'type': 'type1'}, {'class': 'A'}]},
    {'name': 'bar', 'data': [{'id': 2}, {'type': 'type2'}, {'class': 'B'}]}
    ]

My intermediate objective (this may be an XY problem, but I need it anyway) is to build a dict like this:

new_d  = {
    'name': ['foo', 'bar'],
    'id': [1, 2],
    'type': ['type1', 'type2'],
    'class': ['A', 'B']
    }

My final expected output is this DataFrame:

name  id  type class
 foo   1 type1     A
 bar   2 type2     B

I tried the approach below, but I'm getting an error:

new_d = {}

for d in l:
    new_d = {'name': d['name'], **d['data']}

df = pd.DataFrame(new_d)

TypeError: 'list' object is not a mapping

Can you help me fix my code, please?


Solution

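  • The error occurs because d['data'] is a list of dicts, not a mapping, so ** cannot unpack it. Let's use ChainMap to merge that nested list of dicts into a single mapping per record: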

    from collections import ChainMap

    import pandas as pd

    # Combine 'name' with every single-key dict in 'data', one mapping per record
    df = pd.DataFrame(ChainMap({'name': d['name']}, *d['data']) for d in l)
    

    Resulting DataFrame (the column order can vary with the Python and pandas versions):

    print(df)
    
      name class   type  id
    0  foo     A  type1   1
    1  bar     B  type2   2
    

    Intermediate dictionary:

    print(df.to_dict('list'))
    
    {'name': ['foo', 'bar'],
     'class': ['A', 'B'],
     'type': ['type1', 'type2'],
     'id': [1, 2]}
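
If the column order from the question (name, id, type, class) matters, one alternative is to merge each record into a plain dict instead of a ChainMap. This is only a minimal sketch, assuming every entry in 'data' is a dict and that pandas keeps the key insertion order for a list of plain dicts; the names `records` and `row` are introduced here for illustration and are not part of the original answer.

```python
import pandas as pd

l = [
    {'name': 'foo', 'data': [{'id': 1}, {'type': 'type1'}, {'class': 'A'}]},
    {'name': 'bar', 'data': [{'id': 2}, {'type': 'type2'}, {'class': 'B'}]},
]

# Build one flat, plain dict per record: 'name' first, then the keys
# from each small dict in 'data', in the order they appear.
records = []
for d in l:
    row = {'name': d['name']}
    for part in d['data']:
        row.update(part)
    records.append(row)

df = pd.DataFrame(records)

# Column-oriented dict, matching the intermediate shape asked for in the question
new_d = df.to_dict('list')
```

Because `records` holds ordinary dicts, the columns should follow the order in which the keys were inserted, so no reordering step is needed afterwards.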