python pandas dataframe data-cleaning data-wrangling

How to subset a dataframe with conditions on two columns


I have a dataframe and I'm trying to get a subset of it where one column is contained in a list and the other column contains a word.

Context: [image: sample of the dataframe]

The table above is a sample of the dataframe I am working with. I am trying to subset the dataframe to the rows whose name is contained in a list. I also want to subset it further by keeping only the rows whose text contains 'named'.

A sample of the code I use is:

names = ['a', 'an', 'my', 'by', 'mad', 'very', 'just', 'quite', 'one', 'actually', 'life', 'light', 'officially','his', 'old', 'this', 'all','the']
archive[archive['name'].isin(names)]['text'].str.contains('named')

But the code above returns a boolean series. I am trying to get a dataframe.


Solution

  • Chain both conditions with & (bitwise AND, applied element-wise) so that a single boolean mask indexes the dataframe:

    archive[archive['name'].isin(names) & archive['text'].str.contains('named')]
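
A minimal runnable sketch of the difference, assuming a small made-up `archive` dataframe (the rows and the shortened `names` list below are hypothetical, not the asker's data). The original expression filters the rows first and then returns the result of `str.contains`, which is a boolean Series; building both conditions on the full dataframe and combining them gives one mask that selects rows. Passing `na=False` is an optional precaution, assuming the `text` column may contain missing values.

```python
import pandas as pd

# Hypothetical stand-in for the `archive` dataframe from the question.
archive = pd.DataFrame({
    "name": ["a", "Bella", "the", "quite", None],
    "text": [
        "This is a dog named Rex",
        "Bella is named after her grandmother",
        "Here is the good boy",
        "A quite unusual pup named Dot",
        None,
    ],
})
names = ["a", "an", "quite", "the"]  # shortened, illustrative list

# The original approach: the last step is str.contains, so the result
# is a boolean Series rather than a dataframe.
original = archive[archive["name"].isin(names)]["text"].str.contains("named")

# Each condition is a boolean Series aligned to archive's index.
in_names = archive["name"].isin(names)
# na=False treats missing text as "no match" instead of propagating NaN.
has_named = archive["text"].str.contains("named", na=False)

# Combine the masks element-wise and index the dataframe once.
subset = archive[in_names & has_named]

# Equivalent selection using .loc with the same mask.
subset_loc = archive.loc[in_names & has_named]

print(type(original).__name__)  # Series
print(subset)
```

When the conditions are written inline rather than stored in variables, each one needs its own parentheses if it involves a comparison such as `==`, because `&` binds more tightly than comparison operators in Python.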