Pandas: Delete a row if a string is contained in any of the multiple columns


I have a DataFrame that looks like this:

import pandas as pd

df = pd.DataFrame({'Subject1': ['Math', 'BioScience', 'PhysicalScience',
                                'Sociology', 'Psychology', 'Arts'],
                   'Subject2': ['BioScience', 'PhysicalScience', 'Sociology',
                                'Arts', 'Arts', 'PhysicalScience'],
                   'points': [10, 8, 10, 6, 6, 5]})

I would like to delete every row where the Subject1 or Subject2 column contains the string "Science".


Solution

  • Here is one way to do it:

    
    # select rows where Subject1 or Subject2 contains 'Science',
    # then negate the mask to keep only the rows that do not match
    
    out = df.loc[~(df['Subject1'].str.contains('Science') |
                   df['Subject2'].str.contains('Science'))]
    out
    
         Subject1 Subject2  points
    3   Sociology     Arts       6
    4  Psychology     Arts       6
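
If more than two columns need checking, chaining `|` by hand gets unwieldy. A minimal sketch of one possible generalization follows, assuming the columns to search are collected in a list (the name `cols` is illustrative, not from the question). It also passes `regex=False` and `na=False`, which are optional choices here: the first treats "Science" as a literal substring, the second counts missing values as non-matches.

```python
import pandas as pd

df = pd.DataFrame({
    'Subject1': ['Math', 'BioScience', 'PhysicalScience',
                 'Sociology', 'Psychology', 'Arts'],
    'Subject2': ['BioScience', 'PhysicalScience', 'Sociology',
                 'Arts', 'Arts', 'PhysicalScience'],
    'points': [10, 8, 10, 6, 6, 5],
})

# Hypothetical list of columns to search; extend it as needed.
cols = ['Subject1', 'Subject2']

# Test each listed column for the substring, then flag a row
# if any of those columns matched.
mask = df[cols].apply(
    lambda s: s.str.contains('Science', regex=False, na=False)
).any(axis=1)

# Keep only the rows where none of the listed columns matched.
out = df[~mask]
print(out)
```

For the sample data this should keep the same two rows (index 3 and 4) as the two-column version above. Adding `case=False` to `str.contains` would make the match case-insensitive if that is wanted.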