python, dataframe, python-dateutil

Inconsistent ValueError with timezone offset out of bounds


The timezone offset part of a timestamp must be between -12 and +14 hours; otherwise it's nonsensical. I'm working with a field of timestamp strings in a pandas dataframe, and some of my timestamps are nonsensical because the offset falls outside this -12 to +14 hour range.

A good timestamp:

good = '2019-11-11T07:08:09.640-4:00'

A bad timezone offset:

bad = '2019-11-19T22:51:34.619000+17:00'

Another bad timezone offset:

bad2 = '2019-11-11T07:08:09.640-31:00'

Now, if I try to convert these strings to isoformat:

Works as expected:

import dateutil.parser
dateutil.parser.parse(good).isoformat()
'2019-11-11T07:08:09.640000-04:00'

Does not work as expected; it returns a timestamp instead of raising an error:

dateutil.parser.parse(bad).isoformat()
'2019-11-19T22:51:34.619000+17:00'

Works as expected; I get an error message (which I could subsequently handle in an if/else or try/except block):

dateutil.parser.parse(bad2).isoformat()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
ValueError: offset must be a timedelta strictly between -timedelta(hours=24) and timedelta(hours=24).

Why do I get an error message on bad2 and not on bad, when they both have out-of-bounds time zone offsets?


Solution

  • This is simply the bounds of time zone offsets in Python: as noted in the error message, an offset must be strictly between -24h and +24h, which is consistent with your findings. +17:00 falls inside that range, so it parses; -31:00 falls outside it, so it raises. The bound is not related to the current maximum and minimum offsets in real time zones, other than the fact that it would be a problem if the bounds did not allow all real time zones to be represented.

    There is no simple way to get datetime or dateutil to fail in the way you want because the bounds are not configurable. If you want to detect offsets outside +14/-12 or whatever arbitrary restriction, you'll need to check utcoffset, like so:

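    from datetime import timedelta
    # Inclusive bounds: UTC-12:00 and UTC+14:00 are both real offsets.
    if not (timedelta(hours=-12) <= dt.utcoffset() <= timedelta(hours=14)):
        raise ValueError(...)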
    

    That said, I don't recommend this course of action unless you know that some of your strings have this particular error mode. One thing you will learn dealing with datetimes and time zones is that putting arbitrary restrictions on them is rarely a good idea, because some random country somewhere will decide to make a rule that violates your neat "practical" restriction. I am even mildly wary of the ±24h offset restriction in tzinfo, but that is built into the language and it is at least unlikely that a real offset will violate it any time soon.
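
As a rough sketch of the utcoffset check described above, here is a small helper that reports whether a datetime's offset lies in a chosen range. The name offset_in_bounds and the two constants are hypothetical, and the -12/+14 limits are just the ones from the question, not anything Python enforces. To keep the sketch free of third-party imports, the sample datetimes are built with the standard library's timezone class rather than parsed with dateutil; a datetime returned by dateutil.parser.parse should work the same way, since the helper only calls utcoffset().

```python
from datetime import datetime, timedelta, timezone

# Assumed limits taken from the question: UTC-12:00 through UTC+14:00, inclusive.
MIN_OFFSET = timedelta(hours=-12)
MAX_OFFSET = timedelta(hours=14)


def offset_in_bounds(dt, low=MIN_OFFSET, high=MAX_OFFSET):
    """Return True if dt is timezone-aware and low <= dt.utcoffset() <= high."""
    offset = dt.utcoffset()  # None for naive datetimes
    if offset is None:
        return False
    return low <= offset <= high


# Fixed-offset datetimes standing in for the parsed strings from the question.
good = datetime(2019, 11, 11, 7, 8, 9, 640000, tzinfo=timezone(timedelta(hours=-4)))
bad = datetime(2019, 11, 19, 22, 51, 34, 619000, tzinfo=timezone(timedelta(hours=17)))

print(offset_in_bounds(good))  # True
print(offset_in_bounds(bad))   # False

# An offset of 24 hours or more cannot be represented at all, which is
# why a -31:00 offset raises instead of producing a datetime.
try:
    timezone(timedelta(hours=-31))
except ValueError as exc:
    print("rejected:", exc)
```

Returning a boolean rather than raising leaves the decision to the caller, which may be convenient when filtering a dataframe column; naive datetimes are treated as out of bounds here, which is a design choice you may want to change.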