Tags: python, datetime, python-3.6, iso8601, python-datetime

Python ISO 8601 datetime parsing


I have several kinds of ISO 8601 formatted date strings, and I want to obtain a datetime object from each of them using the datetime library.

Example of the input strings:

  1. 2017-08-01 (1 August 2017)
  2. 2017-09 (September 2017)
  3. 2017-W20 (20th week of 2017)
  4. 2017-W37-2 (Tuesday of the 37th week of 2017)

I can parse the 1st, 2nd and 4th examples, but the 3rd raises an exception.

I am calling datetime.datetime.strptime on them in nested try-except blocks, as follows:

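import datetime

# `date` holds one of the input strings listed above
try:
    d1 = datetime.datetime.strptime(date, '%Y-%m-%d')
except ValueError:
    try:
        d1 = datetime.datetime.strptime(date, '%Y-%m')
    except ValueError:
        try:
            # this is the call that fails for week-only strings
            d1 = datetime.datetime.strptime(date, '%G-W%V')
        except ValueError:
            print('Not going through')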

When I ran the 3rd strptime call on its own in the terminal, this is the error I got:

>>> dstr
'2017-W38'
>>> dt.strptime(dstr,'%G-W%V')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\tushar.aggarwal\Desktop\Python\Python3632\lib\_strptime.py", line 565, in _strptime_datetime
    tt, fraction = _strptime(data_string, format)
  File "C:\Users\tushar.aggarwal\Desktop\Python\Python3632\lib\_strptime.py", line 483, in _strptime
    raise ValueError("ISO year directive '%G' must be used with "
ValueError: ISO year directive '%G' must be used with the ISO week directive '%V' and a weekday directive ('%A', '%a', '%w', or '%u').

And this is how I got the 4th case working:

>>> dstr
'2017-W38-2'
>>> dt.strptime(dstr,'%G-W%V-%u')
datetime.datetime(2017, 9, 19, 0, 0)

The reference for my code is the strptime documentation.

There are many questions on SO about parsing ISO 8601 dates, but I couldn't find one that addresses this issue. Moreover, those questions are very old and target older versions of Python, where the %G and %V directives of strptime are not available.


Solution

  • The pendulum library does a good job with these.

    >>> import pendulum
    >>> pendulum.parse('2017-08-01')
    <Pendulum [2017-08-01T00:00:00+00:00]>
    >>> pendulum.parse('2017-09')
    <Pendulum [2017-09-01T00:00:00+00:00]>
    >>> pendulum.parse('2017-W20')
    <Pendulum [2017-05-15T00:00:00+00:00]>
    >>> pendulum.parse('2017-W37-2')
    <Pendulum [2017-09-12T00:00:00+00:00]>
    

    The pendulum documentation says, 'The library natively supports the RFC 3339 format, most ISO 8601 formats and some other common formats. If you pass a non-standard or more complicated string, the library will fallback on the dateutil parser.'
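
    If you would rather stay with the standard library, here is a minimal sketch of one possible workaround. It is not something pendulum or the datetime documentation prescribes. The error in the question says %G and %V need a weekday directive, so for a week-only string this sketch appends '-1' and parses it as the Monday of that week. The helper name parse_iso_date is hypothetical, the choice of Monday is an assumption (it matches the pendulum output above), and %G, %V and %u require Python 3.6 or later.

```python
import datetime


def parse_iso_date(text):
    """Hypothetical helper: try a few ISO 8601 date layouts in turn."""
    candidates = [
        (text, '%Y-%m-%d'),          # calendar date, e.g. 2017-08-01
        (text, '%Y-%m'),             # year and month, e.g. 2017-09
        (text, '%G-W%V-%u'),         # ISO week with weekday, e.g. 2017-W37-2
        (text + '-1', '%G-W%V-%u'),  # week only, e.g. 2017-W20; assume Monday
    ]
    for value, fmt in candidates:
        try:
            return datetime.datetime.strptime(value, fmt)
        except ValueError:
            continue  # layout did not match, try the next one
    raise ValueError('Unrecognised ISO 8601 date: %r' % text)


for s in ('2017-08-01', '2017-09', '2017-W20', '2017-W37-2'):
    print(s, '->', parse_iso_date(s))
```

    Unlike pendulum.parse, this returns naive datetime objects (no timezone), and it only covers the four layouts listed in the question.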