Let's say I have compiled five regular expression patterns and then created five Boolean variables:
a = re.search(first, mystr)
b = re.search(second, mystr)
c = re.search(third, mystr)
d = re.search(fourth, mystr)
e = re.search(fifth, mystr)
I want to use the power set of (a, b, c, d, e) in a function so that it finds the more specific matches first and then falls through. As you can see, the power set (well, its list representation) should be sorted by number of elements, descending.
Desired behavior:
if a and b and c and d and e:
    return 'abcde'
if a and b and c and d:
    return 'abcd'
[... and all the other 4-matches ]
[now the three-matches]
[now the two-matches]
[now the single matches]
return 'No Match' # did not match anything
Is there a way to use the power set programmatically, and ideally tersely, to get this function's behavior?
You could use the powerset() generator function recipe in the itertools documentation like this:
from itertools import chain, combinations
from pprint import pprint
import re
def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
mystr = "abcdefghijklmnopqrstuvwxyz"
first = "a"
second = "B" # won't match, should be omitted from result
third = "c"
fourth = "d"
fifth = "e"
a = 'a' if re.search(first, mystr) else ''
b = 'b' if re.search(second, mystr) else ''
c = 'c' if re.search(third, mystr) else ''
d = 'd' if re.search(fourth, mystr) else ''
e = 'e' if re.search(fifth, mystr) else ''
# Keep only the labels whose pattern matched (empty strings are falsy).
elements = [elem for elem in (a, b, c, d, e) if elem]
# Largest subsets first; "if group" skips the empty subset.
spec_ps = [''.join(group)
           for group in sorted(powerset(elements), key=len, reverse=True)
           if group]
pprint(spec_ps)
Output:
['acde',
'acd',
'ace',
'ade',
'cde',
'ac',
'ad',
'ae',
'cd',
'ce',
'de',
'a',
'c',
'd',
'e']
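
If what you want is the fall-through function itself rather than the sorted list, something along these lines might work. It is only a sketch: the function name most_specific_match and the label-to-pattern dict are my own assumptions, not anything from the question. It walks the subset sizes from largest to smallest and returns the first combination whose patterns all matched.

```python
import re
from itertools import combinations

def most_specific_match(mystr, patterns):
    """Return the joined labels of the largest fully matching combination.

    `patterns` is a hypothetical mapping of label -> regex string.
    """
    # Run each search once, like the a..e variables in the question.
    matched = {label: bool(re.search(rx, mystr))
               for label, rx in patterns.items()}
    labels = list(patterns)
    # Try the biggest subsets first so the most specific match wins.
    for size in range(len(labels), 0, -1):
        for combo in combinations(labels, size):
            if all(matched[label] for label in combo):
                return ''.join(combo)
    return 'No Match'  # did not match anything

patterns = {'a': 'a', 'b': 'B', 'c': 'c', 'd': 'd', 'e': 'e'}
print(most_specific_match("abcdefghijklmnopqrstuvwxyz", patterns))  # acde
print(most_specific_match("xyz", {'a': 'a', 'b': 'b'}))             # No Match
```

Note that the first hit is always the full set of patterns that matched, so if you only need that label you could join the matched labels directly. The loop is useful when each combination should map to a different return value or action.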