Tags: python, regex, placeholder, regex-greedy, non-greedy

Non-greedy match of .* with ^


Given the string:

s = "Why did you foo bar a <b>^f('y')[f('x').get()]^? and ^f('barbar')^</b>"

How do I replace the ^f('y')[f('x').get()]^ and ^f('barbar')^ with a string, e.g. PLACEXHOLDER?

The desired output is:

Why did you foo bar a <b>PLACEXHOLDER? and PLACEXHOLDER</b>

I've tried re.sub('\^.*\^', 'PLACEXHOLDER', s), but the .* is greedy, so it matches everything from the first ^ to the last one, i.e. ^f('y')[f('x').get()]^? and ^f('barbar')^, and outputs:

Why did you foo bar a <b>PLACEXHOLDER</b>

There can be an unknown number of substrings delimited by ^, so hardcoding the number of matches like this is not desirable:

re.sub('(\^.+\^).*(\^.*\^)', 'PLACEXHOLDER', s)

Solution

  • Adding a question mark after the star makes it non-greedy: .*? matches as few characters as possible, so each match stops at the nearest closing ^.

    \^.*?\^
    

    http://www.regexpal.com/?fam=97647

    Why did you foo bar a <b>^f('y')[f('x').get()]^? and ^f('barbar')^</b>
    

    This is correctly replaced with:

    Why did you foo bar a <b>PLACEXHOLDER? and PLACEXHOLDER</b>
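To check this in Python rather than in an online tester, here is a minimal sketch that runs the greedy and non-greedy patterns side by side on the string from the question. The variable names (greedy, lazy, negated) are only illustrative. The third pattern, a negated character class, is not part of the answer above; it is a common alternative that should give the same result here, and unlike . it also matches across newlines.

```python
import re

s = "Why did you foo bar a <b>^f('y')[f('x').get()]^? and ^f('barbar')^</b>"

# Greedy: .* runs on to the LAST ^ in the string, so everything between
# the first and the last caret is swallowed as a single match.
greedy = re.sub(r'\^.*\^', 'PLACEXHOLDER', s)

# Non-greedy: .*? stops at the NEAREST closing ^, so each ^...^ pair
# is matched and replaced separately.
lazy = re.sub(r'\^.*?\^', 'PLACEXHOLDER', s)

# Alternative (an addition, not from the answer above): match any run of
# characters that are not a caret between the two delimiters.
negated = re.sub(r'\^[^^]*\^', 'PLACEXHOLDER', s)

print(greedy)   # Why did you foo bar a <b>PLACEXHOLDER</b>
print(lazy)     # Why did you foo bar a <b>PLACEXHOLDER? and PLACEXHOLDER</b>
print(negated)  # Why did you foo bar a <b>PLACEXHOLDER? and PLACEXHOLDER</b>
```

Using raw strings (r'...') keeps the backslash in \^ from being interpreted as a string escape, which avoids warnings in recent Python versions.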