Pandas DataFrame with a column containing lists: how to access elements "dynamically"


In my DataFrame, one column (Vector) contains a list of elements in each row:

import pandas as pd

x1 = ['A1','B1','C1']
x2 = ['A2','B2','C2','D2']
                  
df = pd.DataFrame([['ID_1',0,x1],['ID_2',2,x2]], columns=['ID','Key','Vector'])

print(df)

     ID  Key            Vector
0  ID_1    0      [A1, B1, C1]
1  ID_2    2  [A2, B2, C2, D2]

I would like to add a new column (Value) that contains the element of Vector at the position given by Key:

     ID  Key            Vector Value
0  ID_1    0      [A1, B1, C1]    A1
1  ID_2    2  [A2, B2, C2, D2]    C2

I can do it "statically", for instance with df['Value'] = df['Vector'].apply(lambda x: x[1]), which always takes the element at index 1 of each list, but I don't know how to adapt this syntax to pick the index dynamically from Key. Thanks in advance for any hints!


Solution

  • Try:

    df['Value'] = df.apply(lambda row: row['Vector'][row['Key']], axis=1)
    

    When you call .apply on the whole DataFrame (df.apply) rather than on a single column (df['Vector'].apply) and pass axis=1, the function receives each row as a pandas Series (with the default axis=0 it receives each column instead). That lets the function read several values from the same row, here both Vector and Key.
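
For reference, here is a minimal end-to-end sketch using the sample data from the question. The row-wise apply is the approach from the answer; the zip-based list comprehension is an alternative that should give the same result without building a Series per row. The helper pick and the column names Value_zip and Value_safe are hypothetical, added only for illustration, and the None fallback for an out-of-range Key is an assumption rather than something the question asks for.

```python
import pandas as pd

x1 = ['A1', 'B1', 'C1']
x2 = ['A2', 'B2', 'C2', 'D2']
df = pd.DataFrame([['ID_1', 0, x1], ['ID_2', 2, x2]],
                  columns=['ID', 'Key', 'Vector'])

# Row-wise apply: each row arrives as a Series, so both columns are available.
df['Value'] = df.apply(lambda row: row['Vector'][row['Key']], axis=1)

# Alternative: pair each list with its key directly.
df['Value_zip'] = [vec[key] for vec, key in zip(df['Vector'], df['Key'])]


def pick(vec, key, default=None):
    """Hypothetical helper: return vec[key], or default if key is out of range."""
    return vec[key] if 0 <= key < len(vec) else default


# Same lookup, but tolerant of keys that do not fit the list.
df['Value_safe'] = [pick(v, k) for v, k in zip(df['Vector'], df['Key'])]

print(df[['ID', 'Key', 'Value', 'Value_zip', 'Value_safe']])
```

On small frames the two approaches are interchangeable; on large ones the list comprehension is usually the faster of the two, since df.apply with axis=1 calls the Python function once per row on a freshly built Series.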