Tags: python, pandas, dictionary, data-analysis

Split a single pandas column (list of dicts) into new columns, one per dict key and list position


Input:

import pandas as pd

df = pd.DataFrame({'a': [1, 2],
                   'b': [[{'x1': 1, 'x2': 3}, {'x1': 4, 'x2': 1}],
                         [{'x1': 5}, {'x1': 3, 'x2': 6}]],
                   'c': [5, 6]})

If I apply the operation

print(df['b'].apply(pd.Series))

Output is:

                    0                   1
0  {'x1': 1, 'x2': 3}  {'x1': 4, 'x2': 1}
1           {'x1': 5}  {'x1': 3, 'x2': 6}

Expected output:

x1_0  x2_0  x1_1  x2_1
   1     3     4     1
   5   NaN     3     6

The solution should not use eval or literal_eval.

[Image attached to the question for clarity]


Solution

  • You can use concat with a list comprehension, then unstack so that each list position becomes its own set of columns:

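    # Build one small frame per row of 'b'; keys=df.index labels each with its row
    df1 = (pd.concat([pd.DataFrame(x) for x in df['b']], keys=df.index)
            .unstack()                      # move the list position into the columns
            .dropna(how='all', axis=1)      # drop key/position pairs that never occur
            .sort_index(axis=1, level=1))   # group the columns by list position
    # Flatten the (key, position) MultiIndex into names like 'x1_0'
    df1.columns = [f'{a}_{b}' for a, b in df1.columns]
    print(df1)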
    
       x1_0  x2_0  x1_1  x2_1
    0     1   3.0     4   1.0
    1     5   NaN     3   6.0
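
  • As an alternative sketch (not part of the answer above), the same columns can be built by flattening each row's list of dicts into one dict in plain Python and passing the result to the DataFrame constructor. The names flat and result are illustrative, and the last step assumes the goal is to attach the new columns to the remaining columns of df:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2],
                   'b': [[{'x1': 1, 'x2': 3}, {'x1': 4, 'x2': 1}],
                         [{'x1': 5}, {'x1': 3, 'x2': 6}]],
                   'c': [5, 6]})

# For each row, merge its dicts into one dict keyed '<key>_<position>'
rows = [
    {f'{key}_{pos}': value
     for pos, d in enumerate(dicts)
     for key, value in d.items()}
    for dicts in df['b']
]

# Keys missing from a row become NaN; reuse the original index so join aligns
flat = pd.DataFrame(rows, index=df.index)

# Replace column 'b' with the flattened columns
result = df.drop(columns='b').join(flat)
print(result)
```

    Here flat has the columns x1_0, x2_0, x1_1, x2_1, in the order the keys are first encountered, and result places them after a and c. No eval or literal_eval is involved, since the cells of b are already Python lists of dicts.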