Tags: python, regex, infinite-loop

Python Regex re module running indefinitely


I am trying to replace every standalone instance of Um with #Um. For example, "Um, i have an Umbrella" should become "#Um, i have an Umbrella". Umbrella should stay as is, because it's not just Um.

Below is my code.

while re.search(r'\bUm\b', trans):
    trans = re.sub(r'\bUm\b', r'#Um', trans)

And my code does not stop running. It loops indefinitely. Is there any other way to approach this problem?


Solution

  • After you do the replacement, the string still matches the regexp you're testing, because there's a word boundary between # and U in #Um. A word boundary is any place where there's a word character on one side and a non-word character (or the start or end of the string) on the other.

    So after you do the first replacement the string is

    #Um, i have an Umbrella
    

    The next iteration changes it to

    ##Um, i have an Umbrella
    

    and it keeps adding # over and over.

    Since re.sub() replaces all matches in the string (unless you pass the optional count argument to limit the number of replacements), there's no need to repeat the same replacement in a loop. Get rid of the while statement and call re.sub() just once.
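
A minimal sketch of the fix described above, using the sample string from the question. The `count=1` call and the lookbehind variant at the end are illustrative additions, not part of the original answer. The lookbehind is only one possible way to make the substitution safe to run more than once.

```python
import re

trans = "Um, i have an Umbrella"

# A single call replaces every standalone "Um". "Umbrella" is left alone
# because there is no word boundary between "m" and "b".
result = re.sub(r'\bUm\b', '#Um', trans)
print(result)

# The original pattern still matches the already-replaced string,
# which is why the while loop in the question never terminated.
still_matches = re.search(r'\bUm\b', result) is not None
print(still_matches)

# The optional count argument limits how many matches are replaced.
first_only = re.sub(r'\bUm\b', '#Um', "Um, Um", count=1)
print(first_only)

# Illustrative variant (an assumption, not from the answer): a negative
# lookbehind skips any "Um" that is already preceded by "#", so running
# the substitution again leaves the string unchanged.
guarded = re.sub(r'(?<!#)\bUm\b', '#Um', result)
print(guarded)
```

If the replacement might be applied to text that has already been processed, the lookbehind form avoids stacking extra # characters. For a one-off pass, the plain single call is enough.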