str.contains: "\" should only be used as an escape character outside of raw strings


I have a dataframe with one column, and I want to know whether the values in that column contain a "+". I wrote this:

mask = df['column'].str.contains("\+")

But when I run the SonarQube analysis, it reports a bug ("\" should only be used as an escape character outside of raw strings).

I tried changing the line to this:

mask = df['column'].str.contains(r"+")

But that raises an error:

re.error: nothing to repeat at position 0

How can I get the same behavior as the first line without the error?


Solution

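  • Two kinds of escaping are involved here. In a Python string literal, a backslash starts an escape sequence (e.g. \n is a newline and has nothing to do with n). In a regex, a backslash tells the engine not to treat the next character as a special symbol. The regex needs \+ to match a literal plus sign, and the string literal has to deliver that backslash to the regex engine intact. That is also why r"+" fails: the pattern is a bare +, a quantifier with nothing before it to repeat.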

    Either use a raw string:

    mask = df['column'].str.contains(r'\+')
    

    or escape the backslash:

    mask = df['column'].str.contains('\\+')
    

    Example:

    import pandas as pd

    df = pd.DataFrame({'column': ['abc', 'ab+c']})
    
    df['column'].str.contains(r'\+')
    
    0    False
    1     True
    Name: column, dtype: bool
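
If a regex is not needed at all, two other options avoid writing the backslash by hand. The sketch below is not part of the original answer; it assumes the same two-row example frame and uses the regex parameter of str.contains and the standard library's re.escape. The names mask_literal and mask_escaped are made up for illustration.

```python
import re

import pandas as pd

df = pd.DataFrame({'column': ['abc', 'ab+c']})

# The raw string and the doubled backslash spell the same two-character pattern.
assert r'\+' == '\\+'

# Option 1: skip the regex engine and search for a plain substring.
mask_literal = df['column'].str.contains('+', regex=False)

# Option 2: let re.escape() add the backslash for a regex search.
mask_escaped = df['column'].str.contains(re.escape('+'))

print(mask_literal.tolist())  # [False, True]
print(mask_escaped.tolist())  # [False, True]
```

Both variants keep backslashes out of the source string, so the escape-character rule has nothing to flag. regex=False is the simpler choice when the search term is always a fixed substring; re.escape is useful when the term is combined with other regex syntax.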