python pandas dataframe search contains

How to use str.contains to identify a word within column values in Python?


Currently in my data I have a column that contains a description of each transaction. I want to use str.contains to identify which rows are AW (the fast food store) transactions. However, when I use data['cat_desc'].str.contains('AW', case=False, na=False), it also matches values that merely contain the substring 'aw', for example 'awxxxx', which I don't want. How can I match 'AW' only as a whole word and not as a substring? Thanks!


Solution

  • Use a regex with word boundaries (r'\b') on both sides of the word:

    data['cat_desc'].str.contains(r'\bAW\b', case=False, na=False, regex=True)
    

    NB. str.contains uses regex=True by default, so passing regex=True explicitly is optional here.
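
To see the difference on concrete values, here is a minimal sketch. The sample descriptions are made up for illustration; only the column name cat_desc and the pattern come from the question and answer above.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'cat_desc' is from the question.
data = pd.DataFrame({
    "cat_desc": ["AW Restaurant", "awxxxx store", "lunch at aw", "LAWSON", None, "A&W"]
})

# Plain substring match: also flags 'awxxxx store' and 'LAWSON'.
substring_match = data["cat_desc"].str.contains("AW", case=False, na=False)

# Word-boundary match: 'aw' must stand alone as a word.
is_aw = data["cat_desc"].str.contains(r"\bAW\b", case=False, na=False)

# Keep only the rows where AW appears as a whole word.
aw_rows = data[is_aw]
print(aw_rows)
```

Note that \b treats letters, digits and the underscore as word characters, so a value such as 'aw_01' would not match, while 'AW-01' or 'AW 01' would.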