python pandas dictionary data-manipulation

Create a pandas dataframe from a dictionary of dictionaries


I have a dictionary of dictionaries similar to the following:

data_dict = {
    'first_entry': {'a': 345, 'b': 8484}, 
    'second_entry': {'a': 3423, 'b': 848}
}

I would like to create a dataframe of all these values, as follows:

pd.DataFrame(
    [list(data_dict[key].values()) for key in data_dict.keys()],
    columns = list(data_dict[list(data_dict.keys())[0]].keys())) 

This works, but I'm concerned about how it accesses the keys, in particular that the column names are taken from whichever nested dictionary happens to come first.

Note: in the above, the outer keys first_entry and second_entry are not reliable, but the inner keys a and b are. The actual data has about 500 nested dictionaries (so first_entry ... five_hundredth_entry, using the above syntax).


Solution

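  • You only need DataFrame.from_dict with orient='index', which turns each outer key into a row label and each inner key into a column. Since the outer keys are not reliable, you can optionally discard them at the end with reset_index(drop=True).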

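    import pandas as pd

    # Outer keys become the index; reset_index(drop=True) replaces them with 0..n-1
    new_df = pd.DataFrame.from_dict(data_dict, orient='index').reset_index(drop=True)
    print(new_df)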
    

    Output

          a     b
    0   345  8484
    1  3423   848
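
For reference, here is a minimal self-contained sketch of the same idea, using the sample data_dict from the question. The names indexed_df and records_df are only illustrative. The last step shows an alternative that is not part of the answer above: because the outer keys are discarded anyway, the inner dictionaries can be passed to the DataFrame constructor as a list of records.

```python
import pandas as pd

# Sample data copied from the question.
data_dict = {
    'first_entry': {'a': 345, 'b': 8484},
    'second_entry': {'a': 3423, 'b': 848},
}

# orient='index': outer keys become row labels, inner keys become columns.
indexed_df = pd.DataFrame.from_dict(data_dict, orient='index')

# Discard the outer keys and use a default integer index instead.
new_df = indexed_df.reset_index(drop=True)

# Alternative sketch: treat each inner dictionary as one row (a record).
records_df = pd.DataFrame(list(data_dict.values()))

print(new_df)
print(records_df)
```

Both approaches read the column names from the inner keys themselves, so nothing depends on looking up the first outer key by position.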