
How to convert string with UTC offset


I have a date string:

In [1]: a = "Sun 10 May 2015 13:34:36 -0700"

When I try to convert it using strptime, it raises an error.

In [3]: datetime.strptime(a, "%a %d %b %Y %H:%M:%S %Z"
   ...: )
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-973ef1c6daca> in <module>()
----> 1 datetime.strptime(a, "%a %d %b %Y %H:%M:%S %Z"
      2 )
/usr/lib/python2.7/_strptime.pyc in _strptime(data_string, format)
    323     if not found:
    324         raise ValueError("time data %r does not match format %r" %
--> 325                          (data_string, format))
    326     if len(data_string) != found.end():
    327         raise ValueError("unconverted data remains: %s" %
ValueError: time data 'Sun 10 May 2015 13:34:36 -0700' does not match format '%a %d %b %Y %H:%M:%S %Z'
In [6]: datetime.strptime(a, "%a %d %b %Y %H:%M:%S %z")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-e4870e34edda> in <module>()
----> 1 datetime.strptime(a, "%a %d %b %Y %H:%M:%S %z")
/usr/lib/python2.7/_strptime.pyc in _strptime(data_string, format)
    315                 del err
    316                 raise ValueError("'%s' is a bad directive in format '%s'" %
--> 317                                     (bad_directive, format))
    318             # IndexError only occurs when the format string is "%"
    319             except IndexError:
ValueError: 'z' is a bad directive in format '%a %d %b %Y %H:%M:%S %z'

According to the docs, the correct directive is %z, so I must be missing something.


Solution

  • Your format string is correct and works fine in Python 3.3:

    >>> a = "Sun 10 May 2015 13:34:36 -0700"
    >>> datetime.strptime(a, "%a %d %b %Y %H:%M:%S %z")
    datetime.datetime(2015, 5, 10, 13, 34, 36, tzinfo=datetime.timezone(datetime.timedelta(-1, 61200)))
    

    It does indeed fail in Python 2.7.

    Unlike strftime(), which is implemented by calling the libc function, strptime() is implemented in pure Python in the standard library (the _strptime module). The version shipped with Python 2.7 doesn’t support the %z directive, whereas the Python 3.3 version does (support was added in Python 3.2).
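
    For readers who can run Python 3.2 or later, here is a minimal sketch of what the object parsed with %z lets you do. The variable names (aware, offset, as_utc) are only illustrative, and the snippet assumes the default C/English locale so that %a and %b match "Sun" and "May".

```python
from datetime import datetime, timedelta, timezone

a = "Sun 10 May 2015 13:34:36 -0700"

# On Python 3.2+, %z produces an "aware" datetime that carries the parsed offset.
aware = datetime.strptime(a, "%a %d %b %Y %H:%M:%S %z")

# The offset from UTC, as a timedelta.
offset = aware.utcoffset()

# The same instant expressed in UTC.
as_utc = aware.astimezone(timezone.utc)

print(offset == timedelta(hours=-7))   # True
print(as_utc.replace(tzinfo=None))     # 2015-05-10 20:34:36
```

    None of this is available in Python 2.7, where the datetime module ships no concrete tzinfo class.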

    So, basically, you have two options:

    1. Using some external library that is able to handle %z.
    2. Implementing it yourself (e.g. by stripping the timezone from the string, feeding the first part to strptime() and parsing the second one manually). Looking at how this is done in the Python library might be helpful.

    I tried to parse this to return an “aware” object, but it is somewhat complicated.

    >>> from datetime import datetime, timedelta
    >>> a = "Sun 10 May 2015 13:34:36 -0700"
    >>> time, tz = a.rsplit(' ', 1)
    >>> d = datetime.strptime(time, '%a %d %b %Y %H:%M:%S')
    >>> d
    datetime.datetime(2015, 5, 10, 13, 34, 36)
    

    Now I have to call d.replace(tzinfo=…tz…) to attach the timezone, but the problem is that I can’t get an instance of tzinfo: tzinfo is an abstract base class, and knowing only the offset from UTC is not enough to identify an actual timezone.

    Since Python 3.2 there is a special timezone class, a subclass of tzinfo, that represents a “fake” timezone defined by just its offset. So there are two ways to proceed:

    1. Backport (basically, copy and paste) the timezone class from Python 3 and use it in your parser.
    2. Return a “naive” object converted to UTC:

      >>> d - timedelta(hours=int(tz[1:3]), minutes=int(tz[3:5])) * (-1 if tz.startswith('-') else 1)
      datetime.datetime(2015, 5, 10, 20, 34, 36)
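
    The first of these two ways (a fixed-offset tzinfo) could be sketched roughly as follows. This is not a copy of the Python 3 timezone class. FixedOffset and parse_with_offset are hypothetical names made up for this example, and the parser assumes the offset always has the form +HHMM or -HHMM. The code should run on both Python 2.7 and Python 3.

```python
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """Hypothetical stand-in for Python 3's timezone: a zone defined only by its UTC offset."""

    def __init__(self, offset):
        self._offset = offset  # a timedelta, e.g. timedelta(hours=-7)

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        # A bare offset carries no daylight-saving information.
        return timedelta(0)

    def tzname(self, dt):
        # No name is known for a bare offset.
        return None


def parse_with_offset(text, fmt="%a %d %b %Y %H:%M:%S"):
    """Parse 'date-part +HHMM' into an aware datetime (assumes a trailing +HHMM/-HHMM)."""
    stamp, tz = text.rsplit(" ", 1)
    sign = -1 if tz[0] == "-" else 1
    offset = sign * timedelta(hours=int(tz[1:3]), minutes=int(tz[3:5]))
    return datetime.strptime(stamp, fmt).replace(tzinfo=FixedOffset(offset))


aware = parse_with_offset("Sun 10 May 2015 13:34:36 -0700")

# Subtracting the offset and dropping tzinfo gives the naive UTC value.
naive_utc = (aware - aware.utcoffset()).replace(tzinfo=None)

print(aware.utcoffset() == timedelta(hours=-7))  # True
print(naive_utc)                                 # 2015-05-10 20:34:36
```

    Keeping the offset on the object, rather than converting to UTC right away, preserves the original local time of day, which the naive variant discards.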