Tags: python, pandas, dataframe, contains, data-analysis

How to match several keywords against a DataFrame column's values using pandas in Python


Hi, I have a list of keywords:

keyword_list=['one','two']

My DataFrame, DF:

    Name      Description
    Sri       Sri is one of the good singer in this two
    Ram       Ram is one of the good cricket player

I want to find the rows whose Description contains all the values in my keyword_list.

My desired output is:

output_Df:
    Name    Description
    Sri     Sri is one of the good singer in this two

I tried mask = DF['Description'].str.contains(), but I can only get it to work for a single word. Please help.

Solution

  • Use np.logical_and.reduce over all the masks created by a list comprehension:

    import numpy as np

    keyword_list = ['one', 'two']

    # one boolean mask per keyword, combined with element-wise AND
    m = np.logical_and.reduce([df['Description'].str.contains(x) for x in keyword_list])
    df1 = df[m]
    print(df1)
    
      Name                                Description
    0  Sri  Sri is one of the good singer in this two
    

    Alternatives for mask:

    m = np.all([df['Description'].str.contains(x) for x in keyword_list], axis=0)
    
    # if no NaNs; note this matches whole whitespace-separated words, not substrings
    m = [set(x.split()) >= set(keyword_list) for x in df['Description']]
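
To try the first approach end to end, here is a minimal sketch that rebuilds the example data from the question and applies the combined mask. The third row with a missing description is an assumption added only to show NaN handling. The `regex=False` and `na=False` arguments are also additions that the original answer does not use: they treat each keyword as a literal substring and count a missing value as no match.

```python
import numpy as np
import pandas as pd

# Example data from the question, plus one hypothetical row with a
# missing description (an assumption, added to show NaN handling).
df = pd.DataFrame({
    "Name": ["Sri", "Ram", "Tom"],
    "Description": [
        "Sri is one of the good singer in this two",
        "Ram is one of the good cricket player",
        None,
    ],
})
keyword_list = ["one", "two"]

# One boolean mask per keyword. regex=False matches the keyword literally;
# na=False makes a missing description evaluate to False instead of NaN.
masks = [
    df["Description"].str.contains(k, regex=False, na=False)
    for k in keyword_list
]

# Keep a row only if every per-keyword mask is True for it.
m = np.logical_and.reduce(masks)
df1 = df[m]
print(df1)
```

Keep in mind that `str.contains` does substring matching, so a keyword such as 'one' would also match inside a longer word like 'someone'. If only whole words should count, the set-based alternative above is the closer fit.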