Tags: python, dataframe, conditional-statements, drop

How to drop certain rows from a dataframe if they partially meet a certain condition


I'm trying to drop rows from a dataframe if they 'partially' meet a certain condition.

By 'partially' I mean that some (not all) of the text in the cell matches the condition.

Let's say that I have this dataframe.

>>> df    
    Title                                  Body
0   Monday report: Stock market            You should consider buying this.
1   Tuesday report: Equity                 XX happened. 
2   Corrections and clarifications         I'm sorry.
3   Today's top news                       Yes, it skyrocketed as I predicted.

I want to remove the entire row if the Title has "Monday report:" or "Tuesday report:".

One thing to note is that I used

TITLE = []
# ... several lines of code to crawl the titles ...
TITLE.append(headline)

to crawl the titles and store them in the dataframe.

Another thing is that my data are in tuples because I used

df = pd.DataFrame(list(zip(TITLE, BODY)), columns =['Title', 'Body'])

to make the dataframe.

I think that's why, when I used

df.query("'Title'.str.contains('Monday report:')")

I got an error.

When I searched Stack Overflow, some answers advised converting the tuples into a multi-index and using filter(), drop(), or isin().

None of them worked.

Or maybe I used them the wrong way...?

Any ideas on how to solve this problem?


Solution

  • You can build a boolean filter for the condition and then invert it with ~:

    e.g. df[~df['Title'].str.contains('Monday report')] returns every row whose Title does not contain 'Monday report'. The result is a new dataframe, so assign it back (df = ...) if you want to keep it.
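
To cover both prefixes from the question in one pass, something like the following might work. It is only a sketch: the sample frame is rebuilt by hand from the rows shown in the question, the names mask and filtered are made up for illustration, and it assumes the Title column holds plain strings. str.contains treats the pattern as a regular expression by default, so "|" acts as "or".

```python
import pandas as pd

# Sample data rebuilt by hand from the rows shown in the question.
df = pd.DataFrame(
    {
        "Title": [
            "Monday report: Stock market",
            "Tuesday report: Equity",
            "Corrections and clarifications",
            "Today's top news",
        ],
        "Body": [
            "You should consider buying this.",
            "XX happened.",
            "I'm sorry.",
            "Yes, it skyrocketed as I predicted.",
        ],
    }
)

# Boolean mask: True where the title contains either phrase.
# na=False treats missing titles as "no match" instead of NaN.
mask = df["Title"].str.contains(r"Monday report:|Tuesday report:", na=False)

# "~" inverts the mask, so only the non-matching rows are kept.
# reset_index(drop=True) renumbers the remaining rows from 0.
filtered = df[~mask].reset_index(drop=True)
print(filtered)
```

If the phrases should only count at the start of the title, df["Title"].str.startswith(("Monday report:", "Tuesday report:")) is a possible alternative on reasonably recent pandas versions, since startswith accepts a tuple of prefixes there.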