I'm not good with regex terminology, so picking a suitable title for this question is somewhat difficult for me; feel free to suggest a better one.
Anyway, I have this text and regular expression:
import re
txt = """
%power {s}
shop %power {w}
%electricity {t}
""".replace('\n',' ')
x = re.findall(r"((%?\w+\s)*)\{([^}]*)\}", txt)
the result is
[('%power ', '%power ', 's'), ('shop %power ', '%power ', 'w'), ('%electricity ', '%electricity ', 't')]
but I intended to get
[('%power ', 's'), ('shop ', '%power ', 'w'), ('%electricity ', 't')]
So how can I achieve the desired result?
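Python's built-in re module keeps only the last value captured by a repeated group, which is why your second group holds only the last word before the brace. To get every repetition, install the PyPI regex module (pip install regex)
and then use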
import regex
txt = """
%power {s}
shop %power ow {w}
%electricity {t}
""".replace('\n',' ')
x = regex.finditer(r"(%?\w+\s)*\{([^{}]*)}", txt)
z = [tuple(y.captures(1) + y.captures(2)) for y in x]
print(z)
Output:
[('%power ', 's'), ('shop ', '%power ', 'ow ', 'w'), ('%electricity ', 't')]
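If installing a third-party package is not an option, roughly the same output can be produced with the standard re module alone. The following is only a sketch under that assumption: the helper name captures_with_re is made up for illustration, and the idea is to capture the whole prefix as one group and then split it into words with a second pass.

```python
import re

txt = """
%power {s}
shop %power ow {w}
%electricity {t}
""".replace('\n', ' ')

def captures_with_re(text):
    """Hypothetical helper: emulate repeated captures using only the re module."""
    results = []
    # Group 1 holds the entire run of words before the brace, group 2 the brace contents.
    for m in re.finditer(r"((?:%?\w+\s)*)\{([^{}]*)\}", text):
        # Second pass: split the prefix back into its individual repetitions.
        words = re.findall(r"%?\w+\s", m.group(1))
        results.append(tuple(words) + (m.group(2),))
    return results

z = captures_with_re(txt)
print(z)
```

This trades the convenience of captures() for a second regex pass, which is usually acceptable for short prefixes like these.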
NOTE on regex.finditer usage
The regex.finditer method returns an iterator, not a list. This means that once a list comprehension has consumed x, you cannot iterate over x a second time; it will yield nothing.
To re-use the contents of x, either convert it to a list (list(x)), or use the approach above: use it only once to build the necessary output structure, and do whatever you need with the result.
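The exhaustion behaviour can be seen in a small sketch. It uses the standard library's re.finditer so that it runs without extra packages, on the assumption (true as far as iterator semantics go) that regex.finditer behaves the same way; the sample text and variable names are only for illustration.

```python
import re

sample = "%power {s} %electricity {t}"
pattern = r"\{([^{}]*)\}"

x = re.finditer(pattern, sample)
first_pass = [m.group(1) for m in x]    # consumes the iterator
second_pass = [m.group(1) for m in x]   # nothing left, so this is empty

# Converting to a list up front makes the matches re-usable.
matches = list(re.finditer(pattern, sample))
reuse_a = [m.group(1) for m in matches]
reuse_b = [m.group(1) for m in matches]

print(first_pass, second_pass, reuse_a, reuse_b)
```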