Example:
import regex
import itertools
m = "90.80.19 90.43.19 908019 92.11.15 90.80.19 930000"
reg = regex.compile(r"\d\d\.?\d\d\.?\d\d")
[list(g) for k, g in itertools.groupby(sorted(reg.findall(m)))]
Output: [['90.43.19'], ['90.80.19', '90.80.19'], ['908019'], ['92.11.15'], ['930000']]
groupby() only groups exact duplicates: just the two occurrences of 90.80.19
have been grouped together.
What I want is to group by the regex above, in which the \.? (the dot)
is optional, so that 90.80.19 and 908019 count as the same value.
Expected output: [['90.43.19'], ['90.80.19', '90.80.19', '908019'], ['92.11.15'], ['930000']]
Is it possible to let groupby() group with a condition?
Use a custom key function that strips the dots before comparing, and pass it
as the key argument of itertools.groupby(iterable, key=None),
as shown below (the initial input string was extended):
import re, itertools
s = "90.80.19 90.43.19 908019 92.11.15 90.80.19 930000 921115"
matches = re.findall(r'\d\d\.?\d\d\.?\d\d', s)
result = [list(g) for k, g in itertools.groupby(sorted(matches),
                                                key=lambda x: x.replace('.', ''))]
print(result)
The output:
[['90.43.19'], ['90.80.19', '90.80.19', '908019'], ['92.11.15', '921115'], ['930000']]
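One caveat: groupby() only merges adjacent items, and sorted(matches) orders the raw strings, not the dot-free keys. For the sample input this happens to put equal keys next to each other, but a value such as 90.90.00 sorts between 90.80.19 and 908019 in plain string order and would split that group. A minimal sketch of a more defensive variant, assuming it is acceptable to sort by the same normalized key (the names normalize, group_codes and tricky are made up for illustration):

```python
import itertools
import re


def normalize(code):
    # Grouping key: the code with its optional dots removed.
    return code.replace(".", "")


def group_codes(text):
    matches = re.findall(r"\d\d\.?\d\d\.?\d\d", text)
    # Sort by (normalized key, original string) so that items with equal
    # keys are adjacent, which is what groupby() requires.
    ordered = sorted(matches, key=lambda c: (normalize(c), c))
    return [list(g) for _, g in itertools.groupby(ordered, key=normalize)]


# "90.90.00" falls between "90.80.19" and "908019" in plain string order.
tricky = "90.80.19 90.90.00 908019"

# Plain sorted(): the two spellings of 908019 end up in separate groups.
plain = [list(g) for _, g in
         itertools.groupby(sorted(tricky.split()), key=normalize)]
print(plain)

# Sorting by the normalized key keeps them together.
print(group_codes(tricky))
```

Using the original string as a tie-breaker in the sort key keeps the order inside each group the same as in the output shown above.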