pandas, list-comprehension

Write value to new column if value in column is in a list in pandas


I am trying to use a list comprehension for some complex column creation in pandas.

For instance, I want to use a list as a reference to create another column in a pandas DataFrame: if a row's string contains one of the list items, the new column should hold that item, otherwise NaN.

fruit = ['watermelon', 'apple', 'grape']

string                     new_column
watermelons are cool      watermelon
apples are good           apple
oranges are on sale       NaN

I tried a list comprehension:

df['new_column'] = [f in fruit if any(f in s for f in fruit) for s in df['string']]

I don't think this is correct and would appreciate some help!


Solution

  • This will do the job. It joins every fruit found in each string with a comma and replaces empty results with NaN:

    import numpy as np
    import pandas as pd

    fruit = ['watermelon', 'apple', 'grape']
    df = pd.DataFrame({'string': ['watermelons are cool', 'apples are good',
                                  'oranges are on sale', 'apples are not watermelons']})

    # Join every fruit that occurs as a substring of the row, in list order.
    output = df['string'].apply(lambda x: ','.join([f for f in fruit if f in x]))
    # Rows with no match yield an empty string; mark them as missing.
    output[output == ''] = np.nan

    print(output)
    

    Output:

    0          watermelon
    1               apple
    2                 NaN
    3    watermelon,apple
    Name: string, dtype: object
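
Since the question asks for a list comprehension that yields a single fruit per row, here is a minimal sketch of that variant. It assumes the first entry of `fruit` that matches should win when a string contains several, which the question does not specify. The sample strings are the ones used in the answer above.

```python
import numpy as np
import pandas as pd

fruit = ['watermelon', 'apple', 'grape']
df = pd.DataFrame({'string': [
    'watermelons are cool',
    'apples are good',
    'oranges are on sale',
    'apples are not watermelons',
]})

# For each string, take the first entry of `fruit` that occurs as a substring.
# next() falls back to NaN when no entry matches (assumed tie-break: list order).
df['new_column'] = [
    next((f for f in fruit if f in s), np.nan)
    for s in df['string']
]

print(df)
```

Note that `f in s` is a plain substring test, so 'apple' would also match 'pineapple'. If whole-word matching matters, the test would need to change, for example to a regular expression with word boundaries.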