
In pandas, why does tz_convert change the timezone used from LMT to EST?


In the script below, why are tz and tz2 different?

import pandas
import pytz
tz = pytz.timezone('US/Eastern')
t = pandas.Timestamp('2014-03-03 08:05:39.216809')
tz2 = t.tz_localize(pytz.UTC).tz_convert(tz).tz

In this case, tz displays as:

<DstTzInfo 'US/Eastern' LMT-1 day, 19:04:00 STD>

But tz2 displays as:

<DstTzInfo 'US/Eastern' EST-1 day, 19:00:00 STD>

Shouldn't pandas honor the timezone I pass in to tz_convert? (Is this perhaps a known bug?)

Update:

This seems to be more of a question about pytz. The behavior that still confuses me (but likely has a clear explanation) is why the following are different:

tz
<DstTzInfo 'US/Eastern' LMT-1 day, 19:04:00 STD>

tz.localize(t).tzinfo
<DstTzInfo 'US/Eastern' EST-1 day, 19:00:00 STD>

Solution

  • These are NOT the same.

    pytz.timezone(...) gives you the zone as defined by the tz data bundled with your installed pytz package, so what it displays by default depends on the pytz version.

    With an older version of pytz installed:

    In [47]: pytz.__version__
    Out[47]: '2012j'
    
    In [48]: pytz.timezone('US/Eastern')
    Out[48]: <DstTzInfo 'US/Eastern' EST-1 day, 19:00:00 STD>
    

    With a newer version installed:

    In [2]: pytz.__version__
    Out[2]: '2014.4'
    
    In [3]: pytz.timezone('US/Eastern')
    Out[3]: <DstTzInfo 'US/Eastern' LMT-1 day, 19:04:00 STD>
    

    Pandas handles this correctly. You can do the same thing with a datetime directly, like this:

    import datetime
    pytz.timezone('US/Eastern').localize(datetime.datetime(2012,1,1))
    

    The timezone definitions have recently changed to use LMT (local mean time) by default. This doesn't matter: when you localize to the dates you are actually using, they end up with the correct offset for that time zone.

    So, in answer to your question: tz2 is correct because it has been localized to the offset that applies on its date, while tz is the bare zone object, whose displayed LMT offset is only a default and is not tied to any particular date.
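
To make the difference concrete, here is a minimal sketch using only pytz and the standard library. It compares localizing a naive datetime with attaching the bare zone object through tzinfo=. The variable names (naive, aware, attached) are mine and not from the question, and what the second print shows depends on the tz data in your installed pytz, as described above.

```python
import datetime
import pytz

tz = pytz.timezone('US/Eastern')
naive = datetime.datetime(2014, 3, 3, 8, 5, 39)

# localize() looks up the UTC offset in force on that date
# (standard time, since this date is before the March 2014 DST switch).
aware = tz.localize(naive)
localized_offset = aware.utcoffset()

# Attaching the bare zone via tzinfo= skips that lookup, so the result
# carries whatever default offset the zone object itself holds
# (LMT with the newer pytz shown above).
attached = naive.replace(tzinfo=tz)
attached_offset = attached.utcoffset()

print(aware.tzname(), localized_offset)
print(attached.tzname(), attached_offset)
```

This is why going through tz.localize(...) (or pandas' tz_localize / tz_convert) is the safer route: the offset is chosen for the specific date rather than taken from the zone object's default.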