
Python Dateutil Parsing: Minimum number of components


The Python dateutil package can parse dates and datetimes without a format being specified. It always tries to return a date, even when the input does not look like one (e.g. 12). What would be a Pythonic way to require that at least a day, month and year component are present in the input?

from dateutil import parser

dstr = '12'
dtime = parser.parse(dstr)

This returns 2019-06-12 00:00:00: the missing month and year are filled in from the current date.


Solution

  • One way to do it is to split the input string on the likely date delimiters (e.g. ., -, :) and count the resulting pieces. This accepts inputs such as 2016.5.19 or 2016-5-19.

    from dateutil import parser
    import re
    
    def date_parser(thestring):
    
        pieces = re.split(r'\.|-|:', thestring)  # raw string; split on '.', '-' or ':'
    
        if len(pieces) < 3:
            raise Exception('Must have at least year, month and date passed')
    
        return parser.parse(thestring)
    
    print('---')
    thedate = date_parser('2019-6-12')
    print(thedate)
    
    print('---')
    thedate = date_parser('12')
    print(thedate)
    

    This will output:

    ---
    2019-06-12 00:00:00
    ---
    Traceback (most recent call last):
      File "bob.py", line 18, in <module>
        thedate = date_parser('12')
      File "bob.py", line 9, in date_parser
        raise Exception('Must have at least year, month and date passed')
    Exception: Must have at least year, month and date passed
    

    So the first one passes, as there are 3 "pieces" to the date. The second one has only one piece, so it is rejected.

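    How reliable this is depends on the pattern passed to re.split: you have to make sure every delimiter you want to accept is in there. With the pattern above, an input such as 2019/6/12 is rejected because / is not one of the delimiters.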

    You could remove the : from the delimiters if you only want typical date delimiters. Otherwise a bare time such as 12:30:45 also counts as three pieces and passes the check.
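
The effect of the delimiter set can be checked in isolation. The following is only a sketch of the counting step, not part of the original answer: the helper name count_pieces and its delimiters parameter are made up for illustration, and it uses only the standard library, so no dateutil call is involved.

```python
import re

def count_pieces(text, delimiters=".-:"):
    """Count the chunks left after splitting text on any of the delimiters.

    Hypothetical helper for illustration; delimiters must not be empty.
    """
    # re.escape keeps characters such as '.' and '-' literal inside the class.
    pattern = "[" + re.escape(delimiters) + "]"
    # Drop empty chunks so that '2019--12' is not counted as three pieces.
    return len([piece for piece in re.split(pattern, text) if piece])

print(count_pieces("2019-6-12"))                    # 3
print(count_pieces("12"))                           # 1
print(count_pieces("2019/6/12"))                    # 1, '/' is not a delimiter here
print(count_pieces("2019/6/12", delimiters=".-/"))  # 3
print(count_pieces("12:30:45"))                     # 3, a bare time passes the check
print(count_pieces("12:30:45", delimiters=".-/"))   # 1
```

Under these assumptions, a wrapper like date_parser could call count_pieces(thestring) and raise when the result is below 3. Counting pieces only checks the shape of the input; whether the pieces form a valid date is still left to parser.parse.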