python regex substitution

Regex replace: return None/empty string if no match


I thought I knew a bit of regex, but I seem to have found a case where my knowledge is at its end. I tried the approach from "Regex Replace function: in cases of no match, $1 returns full line instead of null". The main difference is that I want to not only replace the input with the match but also insert some characters in between the matched parts. Simply put, I want to standardize the input to a certain pattern. The regex should match and capture specific parts of the input, but not everything:

^[\D]*(?P<from_day>(0?[1-9])|([12][0-9])|3[01])[\.\-\s,■]+(?P<from_month>(0?[1-9])|(1[0-2]))[\.\-\s,■]*(?P<until_day>(0?[1-9])|[12][0-9]|3[01])[\.\-\s,■]+(?P<until_month>(0?[1-9])|1[012])[\D]*$

The replacement string:

\g<from_day>.\g<from_month>-\g<until_day>.\g<until_month>

Input:

28.11 16.12
"13.01 23,09"
01.08.-31.12
"01.01,-51.12"
"01,01.-31,12."
01083112
1.02 - 4.3

Current output:

28.11-16.12.-.
13.01-23.09.-.
01.08-31.12.-.
.-..-.
01.01-31.12.-.
.-..-.
1.02-4.3.-.

Expected/desired:

28.11-16.12
13.01-23.09
01.08-31.12

01.01-31.12

1.02-4.3

https://regex101.com/r/M3arvW/1


Solution

  • You should change your regex to this:

    ^\D*(?P<from_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+(?P<from_month>1[0-2]|0?[1-9])[.\s,■-]*(?P<until_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+(?P<until_month>1[012]|0?[1-9]).*|.+
    

    Updated RegEx Demo

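    This takes care of every issue except the no-match case. The trailing |.+ alternative consumes any line that is not a valid date range, and the named groups stay unset for such a line. To replace those lines with an empty string, pass a lambda function to re.sub as the replacement and return '' whenever from_day did not match.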

    Python Code:

    >>> import re
    >>> arr = ['"01,01.-31,12."', '01083112', '1.02 - 4.3', '"01.01,-51.12"']
    >>> rx = re.compile(r'^\D*(?P<from_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+(?P<from_month>1[0-2]|0?[1-9])[.\s,■-]*(?P<until_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+(?P<until_month>1[012]|0?[1-9]).*|.+')
    >>> for i in arr:
    ...     print(rx.sub(
    ...         lambda m: m.group('from_day') + '.' + m.group('from_month') + '-'
    ...                   + m.group('until_day') + '.' + m.group('until_month')
    ...                   if m.group('from_day') else '',
    ...         i))
    ...
    01.01-31.12
    
    1.02-4.3
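
For reuse outside the interactive session, the same idea can be wrapped in a small function. This is only a sketch: the names normalize_range, RANGE_RX and TEMPLATE are made up for illustration, the pattern is the one from the answer split across lines, and the lambda is swapped for a named callback that reuses the question's replacement string via Match.expand. It is run here against all seven inputs from the question.

```python
import re

# Pattern copied from the answer above, split across lines for readability.
# The trailing `|.+` branch consumes any line that is not a valid date range,
# which leaves all named groups unset (None) for that match.
RANGE_RX = re.compile(
    r'^\D*(?P<from_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+'
    r'(?P<from_month>1[0-2]|0?[1-9])[.\s,■-]*'
    r'(?P<until_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+'
    r'(?P<until_month>1[012]|0?[1-9]).*|.+'
)

# Replacement string from the question, reused through Match.expand().
TEMPLATE = r'\g<from_day>.\g<from_month>-\g<until_day>.\g<until_month>'


def normalize_range(text):
    """Return 'D.M-D.M' for a recognised range, otherwise an empty string."""
    def repl(m):
        # from_day is None when only the catch-all `.+` branch matched.
        if m.group('from_day') is None:
            return ''
        return m.expand(TEMPLATE)
    return RANGE_RX.sub(repl, text)


# The seven inputs from the question.
samples = [
    '28.11 16.12',
    '"13.01 23,09"',
    '01.08.-31.12',
    '"01.01,-51.12"',
    '"01,01.-31,12."',
    '01083112',
    '1.02 - 4.3',
]

for s in samples:
    print(repr(normalize_range(s)))
```

Printing with repr makes the empty results visible as '' instead of blank lines. If the inputs are always single lines, calling RANGE_RX.match and returning '' when from_day is unset would presumably behave the same without going through re.sub.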