python pandas string contains

How to search for a caret (^) in a string


I have a pandas DataFrame with a bunch of strings. Some of the strings contain a caret (i.e. a ^ symbol).

I am trying to remove them using this:

df['text'] = df[df['text'].str.contains('^') == False]

I don't get an error, but it finds a caret in every row, which is not correct. Is there something special about that symbol?


Solution

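  • Per the comments, you must escape the caret or disable the default regex processing. By default, str.contains treats its pattern as a regular expression, and in a regex a bare ^ is the start-of-string anchor, so it matches every string: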

    >>> import pandas as pd
    >>> df = pd.DataFrame({'text':['abc','d^e','fgh']})
    >>> df
      text
    0  abc
    1  d^e
    2  fgh
    >>> df[df.text.str.contains('^', regex=False) == False]
      text
    0  abc
    2  fgh
    >>> df[df.text.str.contains(r'\^') == False]
      text
    0  abc
    2  fgh
    

    Note: while df.text.str.contains(r'\^') == False works, it's customary to invert the Boolean mask with ~. The raw string (r'...') keeps Python from treating \^ as an invalid escape sequence.

    df[~df.text.str.contains(r'\^')]
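
To make the behaviour above concrete, here is a minimal sketch using the same three-row frame from the answer. It first shows why the unescaped pattern flags every row, then filters with an escaped pattern. The use of re.escape and the final str.replace step are illustrative additions, not part of the original answer; the replace step assumes the goal might be to strip the caret characters rather than drop the rows.

```python
import re
import pandas as pd

df = pd.DataFrame({"text": ["abc", "d^e", "fgh"]})

# As a regex, '^' is the start-of-string anchor, so every row matches.
matches_anchor = df["text"].str.contains("^")

# re.escape produces the literal pattern r"\^" without hand-written backslashes.
has_caret = df["text"].str.contains(re.escape("^"))

# Keep only the rows that do not contain a caret.
without_caret_rows = df[~has_caret]

# Assumed alternative goal: keep every row but strip the caret characters.
carets_stripped = df["text"].str.replace("^", "", regex=False)

print(without_caret_rows)
print(carets_stripped)
```

re.escape is mostly useful when the search text comes from a variable and may contain other regex metacharacters; for a fixed single character, regex=False or r'\^' is just as good.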