python, regex, python-re

How to loop over regex matches and do replacement, without using a separate replacement function?


I need to replace every pattern like {foo} with FOO followed by an increasing number, and also call do_something_else(...) for each match. Example:

'hell{o} this {is} a t{est}' => hellO1 this IS2 a tEST3

How can I do this without a replacement function, using just a loop over the matches? I'm looking for something like:

import re

def do_something_else(x, y):  # dummy function
    return None, None

def main(s):
    i = 0
    a, b = 0, 0
    for m in re.findall(r"{([^{}]+)}", s):  # loop over matches, can we
        i += 1                              # do the replacement DIRECTLY IN THIS LOOP?
        new = m.upper() + str(i)
        print(new)
        s = s.replace('{' + m + '}', new)    # BAD here because: 1) s.replace is not OK: bug if "m" occurs multiple times
                                             #                   2) modifying s while looping on f(.., s) is probably not good
        a, b = do_something_else(a, b)
    return s

main('hell{o} this {is} a t{est}')    # hellO1 this IS2 a tEST3

The following code (with a replacement function) works, but the global variables are a big problem here: do_something_else() can take a few milliseconds, and this run might be interleaved with another concurrent run of main():

import re

def replace(m):
    global i, a, b
    a, b = do_something_else(a, b)
    i += 1
    return m.group(1).upper() + str(i)

def main(s):
    global i, a, b
    i = 0
    a, b = 0, 0
    return re.sub(r"{([^{}]+)}", replace, s)

main('hell{o} this {is} a t{est}')

Solution

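  • Use re.finditer. It yields match objects that carry their positions in the string, so you can build the new string piece by piece inside the loop. Example: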

    import re
    s = 'hell{o} this {is} a t{est}'
    counter = 1
    newstring = ''
    start = 0
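    for m in re.finditer(r"{([^{}]+)}", s):
        match_start, match_end = m.span()
        newstring += s[start:match_start]   # copy the text before this match
        rep = m.group(1).upper() + str(counter)
        newstring += rep
        start = match_end                   # resume copying after this match
        counter += 1
    newstring += s[start:]                  # copy the tail after the last match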
    print(newstring)  # hellO1 this IS2 a tEST3
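The same loop can be folded back into the main() from the question. This is only a sketch: do_something_else is the dummy from the question, and collecting the pieces in a list (the name parts is my own) instead of concatenating strings is a choice made here, not something the question requires. Because i, a and b are plain local variables, each call to main() gets its own copies, so no globals are needed.

```python
import re

def do_something_else(x, y):  # dummy function, copied from the question
    return None, None

def main(s):
    i = 0
    a, b = 0, 0
    parts = []   # pieces of the output string, joined at the end
    pos = 0      # index in s just after the previous match
    for m in re.finditer(r"{([^{}]+)}", s):
        i += 1
        a, b = do_something_else(a, b)             # per-match side work
        parts.append(s[pos:m.start()])             # text before this match
        parts.append(m.group(1).upper() + str(i))  # the replacement
        pos = m.end()
    parts.append(s[pos:])                          # tail after the last match
    return "".join(parts)

print(main('hell{o} this {is} a t{est}'))  # hellO1 this IS2 a tEST3
```

Since each match is replaced at its own position, a repeated pattern such as '{a} {a}' gives 'A1 A2', which avoids the s.replace problem noted in the question.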