python pandas dataframe defaultdict

DataFrame from a list of multiple dictionaries where the keys are the index and the values are column data


I am trying to create a DataFrame with pandas from a list of dictionaries which looks like this:


[{'3600': '12', '7600': '1212343'}, {'0': 0.0, '3600': 0.0, '7200': 0.0, '10800': 0.0, '14400': 0.0, '18000': 0.0, '21600': 0.0, '25200': 116.93828280645994} .... ]

My columns are given as a list of names: ["col1", "col2" ...]

I want the keys of the dicts to become the index, and the values of each dict to fill one column. In this example:

                       col1                  col2
0                       0/NaN                0.0
3600                    12                   0.0
7600                    1212343              NaN
7200                    NaN                  0.0
10800                   NaN                  0.0
18000                   NaN                  0.0
21600                   NaN                  0.0
25200                   NaN                  116.93828280645994

So the values of each dictionary basically represent the values of one column. Since the dicts can have different sizes, missing entries need to be filled with NaN.

I thought I had already figured this out with the help of "Create a Dataframe from list of Dictionaries", like this:


    from collections import defaultdict

    import pandas as pd

    columns = ["col1", "col2" ...]
    df_data = mydataasabove

    # Collect the values of every dict under their shared key
    final_dict = defaultdict(list)

    for data in df_data:
        for key, value in data.items():
            final_dict[key].append(value)

    final_dict = dict(final_dict)

    df = pd.DataFrame.from_dict(final_dict, orient='index', columns=columns)

But this gives me a df like this:

                     col1                     col2
3600                   12                    0.0
7600              1212343                    NaN
0                       0                    NaN
7200                    0                    NaN
10800                   0                    NaN
14400                   0                    NaN
18000                   0                    NaN
21600                   0                    NaN
25200             116.938                    NaN

As you can see, the values do not line up with the correct columns. The output of printing final_dict is:

{'3600': ['12', 0.0], '7600': ['1212343'], '0': [0.0], '7200': [0.0], '10800': [0.0], '14400': [0.0], '18000': [0.0], '21600': [0.0], '25200': [116.93828280645994]}
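
The printed dict hints at the cause: `append` only records the order in which values arrive for a key, not which dict they came from, so a value from the second dict lands in the first slot whenever the first dict lacks that key. A minimal sketch of one way around this, assuming shortened sample data and the column names `col1` and `col2`, is to give every key one NaN slot per dict and write by position:

```python
import math
from collections import defaultdict

import pandas as pd

# Shortened sample data based on the question (assumption: two dicts only).
df_data = [
    {'3600': '12', '7600': '1212343'},
    {'0': 0.0, '3600': 0.0, '25200': 116.93828280645994},
]
columns = ['col1', 'col2']

# Every key starts with one NaN slot per input dict.
final_dict = defaultdict(lambda: [math.nan] * len(df_data))

for position, data in enumerate(df_data):
    for key, value in data.items():
        # Write into the slot that belongs to the dict the value came from.
        final_dict[key][position] = value

df = pd.DataFrame.from_dict(dict(final_dict), orient='index', columns=columns)
print(df)
```

With this layout, a key that only appears in the second dict keeps NaN in the first column instead of shifting left.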

I also tried something along these lines with ChainMap:

    df = pd.DataFrame.from_dict(ChainMap(*nec_data), orient='index', columns=['col1'])

but I couldn't add multiple columns.
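
A likely reason the ChainMap attempt yields only one column: a `ChainMap` presents several dicts as a single mapping, and for a key that occurs in more than one dict only the value from the first dict is visible. The short sketch below, using shortened sample data as an assumption, shows that the second value for `'3600'` is hidden:

```python
from collections import ChainMap

# Shortened sample data based on the question.
nec_data = [
    {'3600': '12', '7600': '1212343'},
    {'0': 0.0, '3600': 0.0},
]

merged = ChainMap(*nec_data)

# Lookups search the dicts in order, so the first dict wins for '3600'.
print(merged['3600'])
# Only one value per key survives, which is why a single column comes out.
print(dict(merged))
```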

Maybe someone can lend me a 🖐? It would be very much appreciated! Thanks in advance


Solution

  • You can read the list of dictionaries (d below) as a DataFrame and transpose it, then rename the columns with df.rename and an f-string:

    pd.DataFrame(d).T.rename(columns=lambda x: f"col{x+1}")
    

              col1     col2
    3600        12        0
    7600   1212343      NaN
    0          NaN        0
    7200       NaN        0
    10800      NaN        0
    14400      NaN        0
    18000      NaN        0
    21600      NaN        0
    25200      NaN  116.938
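
For reference, here is a self-contained sketch of the same idea. The sample data is shortened from the question, and two steps go beyond the answer above and are only assumptions about what may be wanted: assigning the names from an existing `columns` list instead of generating them, and converting the string keys to integers so the index sorts numerically.

```python
import pandas as pd

# Shortened sample data based on the question.
d = [
    {'3600': '12', '7600': '1212343'},
    {'0': 0.0, '3600': 0.0, '7200': 0.0, '25200': 116.93828280645994},
]
columns = ['col1', 'col2']

# Each dict becomes a row; transposing turns the dict keys into the index.
df = pd.DataFrame(d).T

# Use the predefined names (one per input dict) instead of generated ones.
df.columns = columns

# Optional: convert the string keys to integers and sort numerically.
df.index = df.index.map(int)
df = df.sort_index()
print(df)
```

Keys missing from a dict are filled with NaN automatically, so no manual padding is needed.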