Search code examples
pythonpandasdataframepandas-groupby

Pandas Implode and Create 2 New Column based on list value


I have pandas data frame like this

Name   Col1
Foo    [A, C, B, D, E]
Bar    [G, M, O]
Baz    [E, B]

I want to change it to:

Name   New_Col1   New_Col2   
Foo    A          C
Foo    C          B
Foo    B          D
Foo    D          E
Bar    G          M
Bar    M          O
Baz    E          B

How can I do that?


Solution

  • Using explode and GroupBy.shift:

    s = df['Col1'].explode()
    df.drop(columns='Col1').join(
     pd.concat([s, s.groupby(level=0).shift(-1)], axis=1)
       .set_axis(['New_Col1', 'New_Col2'], axis=1)
       .dropna()
    )
    

    output:

      Name New_Col1 New_Col2
    0  Foo        A        C
    0  Foo        C        B
    0  Foo        B        D
    0  Foo        D        E
    1  Bar        G        M
    1  Bar        M        O
    2  Baz        E        B
    

    With the approach, you can easily generalize to more combinations, example with 3:

    s = df['Col1'].explode()
    
    n = 3
    
    df.drop(columns='Col1').join(
     pd.concat([s.groupby(level=0).shift(-x) for x in range(n)], axis=1)
       .set_axis([f'New_Col{i+1}' for i in range(n)], axis=1)
       .dropna()
    )
    

    output:

      Name New_Col1 New_Col2 New_Col3
    0  Foo        A        C        B
    0  Foo        C        B        D
    0  Foo        B        D        E
    1  Bar        G        M        O
    2  Baz      NaN      NaN      NaN