I want the regex engine to look for a certain pattern, and then replace only a subset of that pattern. The strings look like this:
string1 = 'r|gw|gwe|bbbss|gwe | s'
And I want to replace some of the strings using a regex like this:
re.sub('\|(gw.*)\|','nn',string1)
So I want to look for the stuff between the |'s, but I only want to replace what's between them, and not the entire |(gw.*)| match.
Is there a concise way to do this?
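If you want to retain the pipe characters and still match adjacent fields that share a pipe, you need to use lookaround assertions, which check for the pipes without consuming them. Because * is a greedy operator, .* will consume as much as possible, running all the way to the last | in the string.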
In this case you can use a negated character class such as [^|]*, or the lazy quantifier *?, to prevent greediness.
>>> import re
>>> re.sub(r'(?<=\|)gw[^|]*(?=\|)', 'nn', string1)
'r|nn|nn|bbbss|nn| s'
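To see why both the lookarounds and the non-greedy match matter, here is a small side-by-side sketch using the string from the question. The variable names (greedy, consuming, lookaround, lazy) are just labels for this illustration.

```python
import re

string1 = 'r|gw|gwe|bbbss|gwe | s'

# Greedy .* runs to the last pipe, so a single match swallows several fields.
greedy = re.sub(r'\|(gw.*)\|', 'nn', string1)

# Consuming the pipes: the closing pipe of one match cannot also open the
# next match, so the adjacent field 'gwe' is skipped.
consuming = re.sub(r'\|gw[^|]*\|', '|nn|', string1)

# Lookarounds only assert that the pipes are there; they are not consumed.
lookaround = re.sub(r'(?<=\|)gw[^|]*(?=\|)', 'nn', string1)

# The lazy quantifier gives the same result as the negated class here.
lazy = re.sub(r'(?<=\|)gw.*?(?=\|)', 'nn', string1)

print(greedy)      # rnn s
print(consuming)   # r|nn|gwe|bbbss|nn| s
print(lookaround)  # r|nn|nn|bbbss|nn| s
print(lazy)        # r|nn|nn|bbbss|nn| s
```

The negated class is usually the safer of the two non-greedy options, since it can never cross a pipe regardless of what follows.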
Or you could take a more general approach perhaps:
>>> '|'.join(['nn' if i.startswith('gw') else i for i in string1.split('|')])
'r|nn|nn|bbbss|nn| s'
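The split/join approach can be wrapped in a small helper if you need it in more than one place. This is only a sketch: replace_fields and its parameters are hypothetical names, not anything from the standard library. Note one difference from the regex: the split version also replaces a matching first or last field, whereas the lookaround pattern requires a pipe on both sides.

```python
import re

def replace_fields(text, prefix='gw', replacement='nn', sep='|'):
    """Replace every sep-delimited field that starts with prefix."""
    return sep.join(
        replacement if field.startswith(prefix) else field
        for field in text.split(sep)
    )

string1 = 'r|gw|gwe|bbbss|gwe | s'
print(replace_fields(string1))  # r|nn|nn|bbbss|nn| s

# Edge case: matching fields at the start and end of the string.
edge = 'gw|a|gwx'
print(replace_fields(edge))                          # nn|a|nn
print(re.sub(r'(?<=\|)gw[^|]*(?=\|)', 'nn', edge))   # gw|a|gwx
```

Which behaviour you want at the ends of the string depends on your data, so pick the variant that matches it.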