How to delete rows that contain a word from a list in Python


As stated in the title, I have a pandas DataFrame with string sentences in the column "title". I now want to keep only the rows where the title column contains one of the words specified in the list "keywords".

keywords = ["Simon", "Mustermann"]

df =

       Title                        Bla
    0  Simon is a python beginner   ...
    1  Second balaola               ...
    2  Simon                        ...

Since "Simon" is found in rows with index 0 and 2, they should be retained.

My code at the moment is the following:

    new_df = df[df["title"].isin(keywords)]

However, the result only contains the third row, not the first one. How can I fix this? Thanks a lot for your support and time!


Solution

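  • isin only matches cells whose entire value equals one of the keywords, which is why only the third row ("Simon") is returned. To match a keyword anywhere inside the string, use str.contains with the keywords joined into a regular-expression alternation. This snippet should work for you: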

    keywords = ["Simon", "Mustermann"]
    
    # filter rows where column title contains one of the keywords
    df_filtered = df[df["title"].str.contains("|".join(keywords))]
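If the keywords might contain regex special characters, or the column might contain missing values, a slightly more defensive variant could look like the sketch below. It rebuilds the sample data from the question (the column name "title" follows the question's code, and the "Bla" values are placeholders), escapes each keyword with re.escape, and passes na=False so missing titles count as non-matches. The names pattern, mask, df_kept and df_dropped are made up for this illustration.

```python
import re

import pandas as pd

keywords = ["Simon", "Mustermann"]

# Sample data reconstructed from the question; the "Bla" values are placeholders.
df = pd.DataFrame(
    {
        "title": ["Simon is a python beginner", "Second balaola", "Simon"],
        "Bla": ["...", "...", "..."],
    }
)

# Escape each keyword so characters such as "." or "+" are matched literally,
# then join them into one alternation pattern: "Simon|Mustermann".
pattern = "|".join(re.escape(word) for word in keywords)

# na=False treats missing titles as "no match" instead of propagating NaN.
mask = df["title"].str.contains(pattern, na=False)

df_kept = df[mask]      # rows whose title contains at least one keyword
df_dropped = df[~mask]  # rows whose title contains none of the keywords

print(df_kept.index.tolist())     # [0, 2]
print(df_dropped.index.tolist())  # [1]
```

Negating the mask with ~ gives the opposite selection, which is the one to use if the goal is to delete the rows that contain a keyword rather than keep them. Note that this is still a substring match, so "Simon" would also match "Simone"; wrapping the pattern in word boundaries (\b) is one way to tighten that if needed.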