python-3.x, datetime, timezone, utc

How to determine the appropriate timezone to apply for historical dates in a given region in Python 3


I'm using python3 on Ubuntu 20.04.

I have a trove of files with naive datetime strings in them, dating back more than 20 years. I know that all of these datetimes are in the Pacific Timezone. I would like to convert them all to UTC datetimes.

However, whether each one is relative to PDT or PST is the harder question. Because the dates on which PDT/PST switch have themselves changed over the last 20 years, a simple day/month threshold isn't enough to decide whether to apply the PDT or PST offset. Is there an elegant way to make this determination and apply it?


Solution

  • Note upfront, for Python 3.9+: use zoneinfo from the standard library; a third-party library is no longer needed.
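
A minimal sketch of the zoneinfo route, assuming Python 3.9+ and that the system (or the tzdata package) provides the IANA database. The helper name pacific_to_utc is made up for illustration, and "America/Los_Angeles" is used as the canonical IANA name that "US/Pacific" aliases:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9

# "America/Los_Angeles" is the canonical IANA name; "US/Pacific" is an alias for it
TZ_PACIFIC = ZoneInfo("America/Los_Angeles")

def pacific_to_utc(s: str) -> datetime:
    """Parse a naive ISO 8601 string as Pacific wall time and convert it to UTC.

    Hypothetical helper, not part of any library.
    """
    naive = datetime.fromisoformat(s)
    aware = naive.replace(tzinfo=TZ_PACIFIC)  # replace() is fine with zoneinfo
    return aware.astimezone(timezone.utc)

datestrings = ["1991-04-06T00:00:00",  # PST
               "1999-10-30T00:00:00",  # PDT
               "2012-03-11T00:00:00"]  # PST

utc_strings = [pacific_to_utc(s).isoformat() for s in datestrings]
for s in utc_strings:
    print(s)
# 1991-04-06T08:00:00+00:00
# 1999-10-30T07:00:00+00:00
# 2012-03-11T08:00:00+00:00
```

The PST/PDT decision is made per timestamp by the time zone rules in effect on that date, so no manual threshold is needed.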


    Here's what you can do to set the timezone and convert to UTC. dateutil takes the DST transitions from the IANA time zone database.

    from datetime import datetime
    import dateutil.tz  # import the tz submodule explicitly so dateutil.tz is available
    
    datestrings = ['1991-04-06T00:00:00', # PST
                   '1991-04-07T04:00:00', # PDT
                   '1999-10-30T00:00:00', # PDT
                   '1999-10-31T02:01:00', # PST
                   '2012-03-11T00:00:00', # PST
                   '2012-03-11T02:00:00'] # PDT
    
    # to naive datetime objects
    dateobj = [datetime.fromisoformat(s) for s in datestrings]
    
    # set timezone:
    tz_pacific = dateutil.tz.gettz('US/Pacific')
    dtaware = [d.replace(tzinfo=tz_pacific) for d in dateobj] 
    # with pytz use localize() instead of replace
    
    # check whether DST is in effect (non-zero means PDT):
    # for d in dtaware: print(d.dst())
    # 0:00:00
    # 1:00:00
    # 1:00:00
    # 0:00:00
    # 0:00:00
    # 1:00:00
    
    # convert to UTC:
    dtutc = [d.astimezone(dateutil.tz.UTC) for d in dtaware]
    
    # check output
    # for d in dtutc: print(d.isoformat())
    # 1991-04-06T08:00:00+00:00
    # 1991-04-07T11:00:00+00:00
    # 1999-10-30T07:00:00+00:00
    # 1999-10-31T10:01:00+00:00
    # 2012-03-11T08:00:00+00:00
    # 2012-03-11T09:00:00+00:00
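
One caveat with naive local timestamps: on the transition days, some wall times occur twice (fall back) or not at all (spring forward). The last timestamp above, 2012-03-11T02:00:00, falls in such a skipped hour. A rough sketch for flagging these cases, assuming Python 3.9+ with zoneinfo and relying on the fold attribute from PEP 495; the function name classify is hypothetical:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

TZ_PACIFIC = ZoneInfo("America/Los_Angeles")

def classify(naive: datetime, tz=TZ_PACIFIC) -> str:
    """Label a naive wall time as 'normal', 'ambiguous', or 'nonexistent' in tz."""
    first = naive.replace(tzinfo=tz, fold=0).utcoffset()
    second = naive.replace(tzinfo=tz, fold=1).utcoffset()
    if first == second:
        return "normal"
    # Fall back: the wall time occurs twice, DST offset first, then standard.
    # Spring forward: the wall time was skipped, so fold=0 yields the earlier
    # (standard) offset and fold=1 the later (DST) offset.
    return "ambiguous" if first > second else "nonexistent"

labels = {
    s: classify(datetime.fromisoformat(s))
    for s in ("1999-10-31T01:30:00",   # fall back: 01:00-02:00 happens twice
              "2012-03-11T02:30:00",   # spring forward: 02:00-03:00 is skipped
              "2012-03-11T00:00:00")   # ordinary time
}
print(labels)
```

For ambiguous or nonexistent inputs the source files alone cannot say which instant was meant, so it may be worth logging them for review rather than converting silently.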
    

    Now, if you'd like to be absolutely sure that DST (PDT vs. PST) is set correctly, you'd have to set up test cases and verify them against the IANA data, I guess.
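
Such test cases could be sketched roughly as follows. This assumes Python 3.9+ with zoneinfo; the helper check_offsets and the chosen dates are illustrative only. The cases straddle the 2007 US rule change (DST start moved from the first Sunday in April to the second Sunday in March, and the end from the last Sunday in October to the first Sunday in November), which is exactly where a fixed day/month threshold would go wrong:

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

TZ_PACIFIC = ZoneInfo("America/Los_Angeles")

# (naive local time, expected UTC offset in hours)
CASES = [
    ("2006-03-20T12:00:00", -8),  # 2006: DST began April 2, so still PST
    ("2007-03-20T12:00:00", -7),  # 2007: DST began March 11, so already PDT
    ("2006-11-01T12:00:00", -8),  # 2006: DST ended October 29, so PST
    ("2007-11-01T12:00:00", -7),  # 2007: DST ended November 4, so still PDT
]

def check_offsets(cases, tz):
    """Return the cases whose computed UTC offset differs from the expected one."""
    failures = []
    for s, expected_hours in cases:
        offset = datetime.fromisoformat(s).replace(tzinfo=tz).utcoffset()
        if offset != timedelta(hours=expected_hours):
            failures.append((s, expected_hours, offset))
    return failures

failures = check_offsets(CASES, TZ_PACIFIC)
print(failures or "all offsets match")
```

The same list of cases can be run against a dateutil tz object by passing it as the tz argument, since both expose utcoffset() through the standard tzinfo interface.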