I have a pandas DataFrame (named s), and one of its columns (date) holds Timestamp values:
s.date[0]
Out[126]:
Timestamp('2014-01-28 00:52:00-0500', tz='dateutil//usr/share/zoneinfo/America/New_York')
At some point in the code I need to select a subset of s (using idx, a list of booleans). The output is:
s.date[idx]
Out[125]:
1019 2014-12-01 00:52:00-05:00
1020 2014-12-01 01:52:00-05:00
1021 2014-12-01 02:52:00-05:00
Name: date, dtype: datetime64[ns, tzfile('/usr/share/zoneinfo/America/New_York')]
Since I'm only interested in the hour, I thought I could just do:
s.date.hour
but of course, I get the error
AttributeError: 'Series' object has no attribute 'hour'
Since this works on a single element:
s.date[0].hour
Out[128]: 0
I decided to use a lambda to apply .hour to every row:
s.date[idx].apply(lambda x: x.hour)
Out[129]:
1019 5
1020 6
1021 7
As you can see, I am getting the hour in UTC rather than in Eastern Time.
I've searched online but found nothing. Is there a way to get the hour in local (non-UTC) time?
Thanks!
Using pandas 0.16.2, I had no problem getting local US Eastern hours from tz-aware timestamps via the .dt accessor:
>>> import pandas as pd
>>> s = pd.Series(pd.date_range('20130101 09:10:12', periods=4, tz='US/Eastern', freq='H'))
>>> s
0 2013-01-01 09:10:12-05:00
1 2013-01-01 10:10:12-05:00
2 2013-01-01 11:10:12-05:00
3 2013-01-01 12:10:12-05:00
dtype: object
>>> s.dt.hour
0 9
1 10
2 11
3 12
dtype: int64
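If the hours still come out in UTC on a given setup, one possible workaround is to convert the column explicitly before reading the hour. The sketch below is only an illustration, not the asker's data: the values are hypothetical UTC instants chosen to match the three rows in the question, and America/New_York is assumed as the target zone.

```python
import pandas as pd

# Hypothetical data: three UTC instants corresponding to the rows in the question.
utc = pd.Series(pd.to_datetime(
    ["2014-12-01 05:52", "2014-12-01 06:52", "2014-12-01 07:52"], utc=True))

# Convert to the desired zone first, then read the hour through the .dt accessor.
eastern = utc.dt.tz_convert("America/New_York")
local_hours = eastern.dt.hour   # hours in Eastern Time
utc_hours = utc.dt.hour         # hours in UTC, for comparison

print(local_hours.tolist())
print(utc_hours.tolist())
```

Both Series describe the same instants; only the zone used to read off the hour differs.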
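Selecting a subset by label first and then calling .dt.hour also worked fine (.loc is used here because .ix has since been removed from pandas):
>>> idx = [1, 3]
>>> s.loc[idx].dt.hour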
1 10
3 12
dtype: int64
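Since the question selects rows with a list of booleans rather than labels, here is a minimal sketch of the same idea with a boolean mask. The mask values and the America/New_York zone name are illustrative assumptions, and .loc is used because it accepts boolean lists in current pandas releases.

```python
import pandas as pd

# Hypothetical tz-aware Series, mirroring the four timestamps in the answer.
naive = pd.to_datetime(
    ["2013-01-01 09:10:12", "2013-01-01 10:10:12",
     "2013-01-01 11:10:12", "2013-01-01 12:10:12"])
s = pd.Series(naive.tz_localize("America/New_York"))

# A list of booleans, one per row, as in the question (values are made up).
mask = [False, True, False, True]

# Select the rows, then extract the local hour with the .dt accessor.
hours = s.loc[mask].dt.hour
print(hours)
```

The original index labels survive the selection, so the result can be aligned back to the full frame if needed.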