Search code examples
pythonpandasdataframedata-cleaning

Pandas convert list of list to columns names and append values


I have to columns in pandas dataframe, one with keys second with values, where both are list of lists.

Like this:

import pandas as pd 
example = pd.DataFrame( {'col1': [['key1','key2','key3'],['key1','key4'],['key1', 'key3', 'key4','key5']], 'col2': [['value1','value2','value3'], ['value1','value4'], ['value1', 'value3', 'value4','value5']]  }) 
print(example)
    col1    col2
0   [key1, key2, key3]  [value1, value2, value3]
1   [key1, key4]    [value1, value4]
2   [key1, key3, key4, key5]    [value1, value3, value4, value5]

First i want to convert all possible keys to columns, the append values to them. Final result should look like this

    key1      key2    key3     key4    key5
0   value1    value2  value3   NaN     NaN
1   value1    NaN     NaN      value4  NaN
2   value1    NaN     value3   value4  value5
        

Solution

  • Try using explode and reshaping the dataframe.

    #pandas 1.3.0 update
    df_new = example.explode(['col1', 'col2'])
    #df_new = example.apply(pd.Series.explode)    
    df_new.set_index('col1', append=True).unstack()
    

    Output:

    col1    key1    key2    key3    key4    key5
    0     value1  value2  value3     NaN     NaN
    1     value1     NaN     NaN  value4     NaN
    2     value1     NaN  value3  value4  value5