I thought I knew a bit of regex, but it seems I've found a case where my knowledge is at its end. I tried the approach from "Regex Replace function: in cases of no match, $1 returns full line instead of null". The main difference is that I don't only want to replace the input with the match, I also want to insert some characters between the captured parts. Simply put, I want to standardize the input to a fixed pattern. The regex should match and capture specific parts of the input, but not everything:
^[\D]*(?P<from_day>(0?[1-9])|([12][0-9])|3[01])[\.\-\s,■]+(?P<from_month>(0?[1-9])|(1[0-2]))[\.\-\s,■]*(?P<until_day>(0?[1-9])|[12][0-9]|3[01])[\.\-\s,■]+(?P<until_month>(0?[1-9])|1[012])[\D]*$
The replacement string:
\g<from_day>.\g<from_month>-\g<until_day>.\g<until_month>
Input:
28.11 16.12
"13.01 23,09"
01.08.-31.12
"01.01,-51.12"
"01,01.-31,12."
01083112
1.02 - 4.3
Current output:
28.11-16.12.-.
13.01-23.09.-.
01.08-31.12.-.
.-..-.
01.01-31.12.-.
.-..-.
1.02-4.3.-.
Expected/desired:
28.11-16.12
13.01-23.09
01.08-31.12
01.01-31.12
1.02-4.3
You should change your regex to this:
^\D*(?P<from_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+(?P<from_month>1[0-2]|0?[1-9])[.\s,■-]*(?P<until_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+(?P<until_month>1[012]|0?[1-9]).*|.+
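This takes care of all the issues except the case where there is no match. The two-digit alternatives are now listed first, so a value such as 12 is captured whole instead of stopping at 1. For lines that do not match, the trailing |.+ alternative consumes the whole line, and a lambda function passed to re.sub replaces it with an empty string.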
Python Code:
>>> import re
>>> arr = ['"01,01.-31,12."', '01083112', '1.02 - 4.3', '"01.01,-51.12"']
>>> rx = re.compile(r'^\D*(?P<from_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+(?P<from_month>1[0-2]|0?[1-9])[.\s,■-]*(?P<until_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+(?P<until_month>1[012]|0?[1-9]).*|.+')
>>> for i in arr: print(rx.sub(lambda m: m.group('from_day') + '.' + m.group('from_month') + '-' + m.group('until_day') + '.' + m.group('until_month') if m.group('from_day') else '', i))
...
01.01-31.12
1.02-4.3
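If the one-liner is hard to read, the same idea can be sketched as a small script. This is only an illustration, not the code from the answer above: it swaps re.sub plus a lambda for rx.match() plus Match.expand(), so the trailing .*|.+ part is no longer needed and the replacement string from the question can be reused unchanged. The names SEP, RANGE_RX and normalize_range are made up for this example.

```python
import re

# Separators seen in the sample inputs: dot, whitespace, comma, "■", hyphen.
# (SEP, RANGE_RX and normalize_range are hypothetical names for this sketch.)
SEP = r"[.\s,■-]"

# Same groups as the regex above, laid out with re.VERBOSE for readability.
# Two-digit alternatives come first so "12" is not cut short at "1".
RANGE_RX = re.compile(
    rf"""
    ^\D*
    (?P<from_day>[12]\d|3[01]|0?[1-9])  {SEP}+
    (?P<from_month>1[0-2]|0?[1-9])      {SEP}*
    (?P<until_day>[12]\d|3[01]|0?[1-9]) {SEP}+
    (?P<until_month>1[0-2]|0?[1-9])
    """,
    re.VERBOSE,
)

# The replacement string from the question, used as an expand() template.
TEMPLATE = r"\g<from_day>.\g<from_month>-\g<until_day>.\g<until_month>"


def normalize_range(text: str) -> str:
    """Return 'D.M-D.M' for a recognizable date range, else an empty string."""
    m = RANGE_RX.match(text)
    if m is None:
        return ""  # no match: drop the line instead of echoing it back
    return m.expand(TEMPLATE)


if __name__ == "__main__":
    samples = [
        "28.11 16.12",
        '"13.01 23,09"',
        "01.08.-31.12",
        '"01.01,-51.12"',
        '"01,01.-31,12."',
        "01083112",
        "1.02 - 4.3",
    ]
    for s in samples:
        # repr() makes the empty results for non-matching lines visible.
        print(repr(s), "->", repr(normalize_range(s)))
```

Inputs such as "01.01,-51.12" and 01083112 come back as empty strings, which matches the expected output listed in the question.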