Pandas Dataframe

I have a set of data, which is a nested dictionary. While columns a and b have a single entry, column c is composed of d and e, which have an arbitrary number of elements of the same length.

For instance:

N = 5
nested_dict = {
"a": np.random.randn(N),
"b": np.random.randn(N),
"c": [{"d":np.random.randn(i+1), "e":np.random.randn(i+1)} for i in range(N)]
}

How do I convert this to a multi-index Pandas Dataframe, such that column c has subheadings d and e, each with the length of the array supplied?

Edit: See an example of the desired format below:

In addition, may I save and load this Dataframe as if it were a normal one?

Solution

Try something like:

import pandas as pd
import numpy as np

N = 5
nested_dict = {
    "a": np.random.randn(N),
    "b": np.random.randn(N),
    "c": [{"d": np.random.randn(i + 1), "e": np.random.randn(i + 1)} for i in range(N)]
}

df = pd.DataFrame(data=nested_dict)
# Normalize Nested Dict and merge back
# Set index to 'a', 'b' and unpack lists
df = df.drop(columns=['c']) \
    .merge(pd.json_normalize(df['c']),
           left_index=True,
           right_index=True) \
    .set_index(['a', 'b']) \
    .apply(lambda x: x.apply(pd.Series).stack())

# Add MultiIndex C back
df.columns = pd.MultiIndex.from_product([['c'], df.columns])

# For Display
print(df.to_string())

Output:

                              c          
                              d         e
a         b                              
-0.913707  1.015265 0  0.630905 -0.508003
 0.467421  1.880421 0  0.886313  0.026921
                    1 -0.720613  1.027585
-0.314128 -0.756686 0  0.317922 -0.431624
                    1 -1.154708 -0.370363
                    2  0.400752 -0.000786
 0.488310 -0.230924 0  1.303703 -1.414924
                    1  0.242020  1.401058
                    2 -0.369507  0.648304
                    3  1.491819  1.010083
 1.248220 -0.351634 0  0.106272  0.518489
                    1 -1.916420 -0.068814
                    2 -0.090406 -0.237604
                    3 -0.208762  0.163396
                    4  0.664643 -1.272215