Search code examples
pythonpandassplitpandas-explode

Pandas explode/split lists and reasign values


I have this code

df = pd.DataFrame({'an':{0: 'f', 1: 'i', 2:'-' , 3:'-' , 4:'f' , 5:'c,f,i,j' , 6:'c,d,e,j'},
                   'nv':{0: [-1], 1: [-1], 2: ['-'], 3:['-'] , 4:[-2] , 5:[-2,-1,-3,-1] , 6:[-2,-1,-2,-1]},
                   })

which yields

    an         nv
0   f         [-1]
1   i         [-1]
2   -         [-]
3   -         [-]
4   f         [-2]
5   c,f,i,j   [-2, -1, -3, -1]
6   c,d,e,j   [-2, -1, -2, -1]

I would like to split column ['an'], so that each value inside ['an'] becomes column becomes a column with the value currently assigned on the bracket from ['nv'] column.

Desired output:

    an       nv                  c   d   e  f   i   j
0   f       [-1]                           -1       
1   i       [-1]                               -1   
2   -       [-]                     
3   -       [-]                     
4   f       [-2]                           -2       
5   c,f,i,j [-2, -1, -3, -1]    -2         -1  -3  -1
6   c,d,e,j [-2, -1, -2, -1]    -2  -1  -2         -1

Solution on this question is similar, but it does not work for my problem


Solution

  • Let's split the an column then explode the dataframe on multiple columns, then reshape with pivot:

    df.join(
        df
        .query("an != '-'")
        .assign(an=df['an'].str.split(','))
        .explode(['an', 'nv'])
        .pivot(columns='an', values='nv')
    ).fillna('')
    

            an                nv   c   d   e   f   i   j
    0        f              [-1]              -1        
    1        i              [-1]                  -1    
    2        -               [-]                        
    3        -               [-]                        
    4        f              [-2]              -2        
    5  c,f,i,j  [-2, -1, -3, -1]  -2          -1  -3  -1
    6  c,d,e,j  [-2, -1, -2, -1]  -2  -1  -2          -1