python pandas dataframe dictionary key

Save dictionary to Pandas dataframe with keys as columns and merge indices


I know there are already lots of posts on how to convert a dict to a pandas dataframe, but I could not find one that covers my case. My dictionary maps each key to a single-column dataframe and looks as follows:

[Out 23]:
{'atmosphere':       0
 2     5
 9     4
 15    1
 26    5
 29    5
 ...  ..
 2621  4
 6419  3
 
 [6934 rows x 1 columns],
 'communication':       0
 13    1
 15    1
 26    1
 2621  2
 3119  5
 ...  ..
 6419  4
 6532  1
 
 [714 rows x 1 columns]}

Now, I want to create a dataframe from this dictionary in which 'atmosphere' and 'communication' are the columns and the indices of both items are merged, so that the dataframe looks as follows:

index    atmosphere    communication
2           5
9           4
13                           1
15          1                1
26          5                1
29          5
2621        4                2
3119                         5
6419        3                4
6532                         1

I already tried pd.DataFrame.from_dict, but it saves all values in one row. Any help is much appreciated!


Solution

  • Use concat with axis=1, then DataFrame.droplevel to remove the second level (the original column label 0) from the MultiIndex in the columns:

    import pandas as pd

    # Small sample that mirrors the structure of the dictionary in the question
    d = {'atmosphere':pd.DataFrame({0: {2: 5, 9: 4, 15: 1, 26: 5, 29: 5, 
                                        2621: 4, 6419: 3}}),
         'communication':pd.DataFrame({0: {13: 1, 15: 1, 26: 1, 2621: 2,
                                           3119: 5, 6419: 4, 6532: 1}})}
    
    print (d['atmosphere'])
          0
    2     5
    9     4
    15    1
    26    5
    29    5
    2621  4
    6419  3
    
    print (d['communication'])
          0
    13    1
    15    1
    26    1
    2621  2
    3119  5
    6419  4
    6532  1
    

    df = pd.concat(d, axis=1).droplevel(1, axis=1)
    print (df)
          atmosphere  communication
    2            5.0            NaN
    9            4.0            NaN
    13           NaN            1.0
    15           1.0            1.0
    26           5.0            1.0
    29           5.0            NaN
    2621         4.0            2.0
    3119         NaN            5.0
    6419         3.0            4.0
    6532         NaN            1.0
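
To see why droplevel is needed, it can help to inspect the columns before and after the call. The following is a minimal sketch on a trimmed-down sample; the names `wide` and `flat` are only illustrative, and the explicit sort_index() is an addition so that the row order does not depend on how concat orders the merged index.

```python
import pandas as pd

# Trimmed-down sample with the same structure as the dictionary above
d = {
    "atmosphere": pd.DataFrame({0: {2: 5, 9: 4, 15: 1}}),
    "communication": pd.DataFrame({0: {13: 1, 15: 1}}),
}

# The dict keys become the outer column level; the original label 0 stays as the inner level
wide = pd.concat(d, axis=1)
print(wide.columns.tolist())

# Dropping level 1 leaves only the dict keys as column names
flat = wide.droplevel(1, axis=1).sort_index()
print(flat.columns.tolist())
```

Rows that exist in only one of the two frames are filled with NaN in the other column.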
    

    Alternative solution: select column 0 from each DataFrame first, so concat receives Series and no MultiIndex is created:

    df = pd.concat({k: v[0] for k, v in d.items()}, axis=1)
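
In both variants the missing entries become NaN, which turns the integer values into floats. If integer values are preferred, one possible follow-up (not part of the original answer) is to convert to the nullable "Int64" dtype. This is a sketch on a shortened sample; `df_int` is an illustrative name.

```python
import pandas as pd

# Shortened sample with the same structure as the dictionary above
d = {
    "atmosphere": pd.DataFrame({0: {2: 5, 9: 4, 15: 1}}),
    "communication": pd.DataFrame({0: {13: 1, 15: 1}}),
}

# Same approach as the alternative solution; sort_index() fixes the row order explicitly
df = pd.concat({k: v[0] for k, v in d.items()}, axis=1).sort_index()

# Nullable integer dtype: values stay integers, gaps are shown as <NA>
df_int = df.astype("Int64")
print(df_int)
```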