python pandas string dataframe contains

How to check whether a substring in a dataframe is contained in a long string variable?


My question is the reverse of the usual str.contains() check. I want to check whether a substring stored in the dataframe is contained in a long string variable.

The dataframe looks like this:

Account  Substring            Category
1001     Cash Payment         Category #1
1002     Credit Card Payment  Category #2

The long string variable is long_str = "Cash Payment by Customer".

So when using .loc to filter the dataframe for records whose substring is contained in long_str, is there a function similar to str.contains() that works in the opposite direction?

Below is the code I want to use to filter the dataframe, except that str.contains() won't work here. Thanks!

df.loc[df['Substring'].str.contains(long_str)]


Solution

  • You can simply use the pandas.Series.apply method for that:

    >>> long_str = "Cash Payment by Customer"
    >>> df.loc[df.Substring.apply(lambda x: x in long_str)]
       Account     Substring     Category
    0     1001  Cash Payment  Category #1
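
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The dataframe is rebuilt from the two rows shown in the question. The isinstance guard is an addition of mine, not part of the answer above: it assumes the Substring column might contain missing values, which would otherwise make `x in long_str` raise a TypeError.

```python
import pandas as pd

# Rebuild the example dataframe from the question.
df = pd.DataFrame(
    {
        "Account": [1001, 1002],
        "Substring": ["Cash Payment", "Credit Card Payment"],
        "Category": ["Category #1", "Category #2"],
    }
)
long_str = "Cash Payment by Customer"

# Boolean mask: True where the row's substring occurs inside long_str.
# The isinstance check (an assumption, see above) maps non-string values
# such as NaN to False instead of raising a TypeError.
mask = df["Substring"].apply(lambda x: isinstance(x, str) and x in long_str)

# Keep only the matching rows.
result = df.loc[mask]
print(result)
```

Note that `in` does a plain, case-sensitive substring test, so "cash payment" would not match here. If case should be ignored, one option is to lowercase both sides before comparing.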