I dynamically create a list of regexes, namely regex_list.
Each regex in the list is guaranteed to match at least once in the text it is applied to.
It may happen that some regexes in the list are equal.
import re

regex_list = []
for f in foo:  # foo is a list of strings, e.g. foo = ['foo1', 'foo2', 'foo1', ...]
    # f is a valid expression to be used inside the regex
    regex_list.append(rf'[^.]*?{f}[^.]*\.')  # raw f-string so \. is not treated as a string escape
regex = re.compile('|'.join(regex_list), flags=re.DOTALL)
result = regex.findall(text)
Since some elements of regex_list may be equal, and the elements of regex_list are combined together with the OR operator (|), for any regex that has another copy in the list only the first match in the text is captured.
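To make the behaviour concrete, here is a minimal sketch on made-up data (the sample text and the contents of foo are assumptions for illustration, not my real input). findall scans left to right and returns non-overlapping matches, so a duplicated alternative in the combined pattern does not appear to add any extra results:

```python
import re

# Hypothetical sample data; 'foo1' appears twice in the list on purpose.
text = "Alpha foo1 here. Beta foo2 there. Gamma foo1 again."
foo = ["foo1", "foo2", "foo1"]

# Same construction as above: one alternative per entry, duplicates included.
regex_list = [rf"[^.]*?{f}[^.]*\." for f in foo]
combined = re.compile("|".join(regex_list), flags=re.DOTALL)

# Each sentence is consumed once, whichever alternative matched it,
# so the second copy of the 'foo1' pattern contributes nothing.
matches = combined.findall(text)
print(len(matches), matches)
```

Here three sentences come back, even though the list holds three patterns and the foo1 pattern alone would match two of them.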
A workaround could be to apply each regex individually in a for-loop, but that is very slow.
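For reference, the for-loop workaround looks roughly like this. It is only a sketch: the helper name find_all_per_pattern and the sample data are hypothetical, chosen for illustration.

```python
import re

def find_all_per_pattern(patterns, text):
    """Apply every pattern on its own and concatenate the results.

    Duplicate patterns are applied again, so their matches are repeated.
    """
    results = []
    for pattern in patterns:
        results.extend(re.findall(pattern, text, flags=re.DOTALL))
    return results

# Hypothetical sample data with a duplicated entry.
text = "Alpha foo1 here. Beta foo2 there. Gamma foo1 again."
foo = ["foo1", "foo2", "foo1"]
patterns = [rf"[^.]*?{f}[^.]*\." for f in foo]

per_pattern = find_all_per_pattern(patterns, text)
print(len(per_pattern), per_pattern)
```

This returns the matches of each list entry separately, duplicates included, at the cost of scanning the text once per pattern.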
Is there a good way to combine the regexes so that they match everything possible?
I discovered by chance that applying each regex individually in a for-loop is very slow with the re module, while it is surprisingly faster with the regex module.