python python-3.x django regex regex-group

Regex: how to extract incomplete date and convert


How can I get a date that may not contain the month or the day?

Right now I only know how to break the date apart with a separate pattern for each case:

date_n= '2022-12'
match_year = re.search(r'(?P<year_only>(?P<year>\d+))', date_n)
match_month = re.search(r'(?P<whole_date>(?P<year>\d+)-(?P<month>\d+))', date_n)
match_day = re.search(r'(?P<whole_date>(?P<year>\d+)-(?P<month>\d+)-(?P<day>\d+))', date_n)

year = match_year.group('year_only')
month = match_month.group('month')
day = match_day.group('day')

Using try/except does not work.


Solution

  • Build a pattern for each of year, month and day, then combine them into a single expression in which the month and the day are optional:

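    import re

    date_n = '2022-12'
    year_pattern = r"(?P<year>\d{4})"
    month_pattern = r"(?:-(?P<month>\d{1,2}))"
    day_pattern = r"(?:-(?P<day>\d{1,2}))"
    # Year is required; month is optional, and a day is only allowed after a month.
    match_date = re.search(rf'{year_pattern}(?:{month_pattern}{day_pattern}?)?', date_n)

    # Groups that did not take part in the match are returned as None.
    year = match_date.group('year')    # '2022'
    month = match_date.group('month')  # '12'
    day = match_date.group('day')      # None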
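
If you also want to convert the result into a real date, one possible approach is sketched below. It uses the same pattern as above. The helper name `parse_partial_date` and the choice to fill a missing month or day with 1 are assumptions made for illustration, not something the question specifies, so adjust the defaults to whatever your application needs.

```python
import re
from datetime import date

# Year is required; month and day are optional, and a day needs a month.
PARTIAL_DATE = re.compile(
    r"(?P<year>\d{4})(?:-(?P<month>\d{1,2})(?:-(?P<day>\d{1,2}))?)?"
)


def parse_partial_date(text, default_month=1, default_day=1):
    """Hypothetical helper: return a datetime.date, or None if no year is found."""
    match = PARTIAL_DATE.search(text)
    if match is None:
        return None
    # Groups that did not take part in the match come back as None,
    # so fall back to the (assumed) defaults for them.
    year = int(match.group("year"))
    month = int(match.group("month") or default_month)
    day = int(match.group("day") or default_day)
    # date() raises ValueError for impossible values such as month 13.
    return date(year, month, day)


print(parse_partial_date("2022"))        # 2022-01-01
print(parse_partial_date("2022-12"))     # 2022-12-01
print(parse_partial_date("2022-12-25"))  # 2022-12-25
```

Checking the match for `None` before calling `.group()` also avoids the `AttributeError` that the three separate searches in the question run into when a pattern does not match.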