I'm having a DataFrame with two columns. One column is filled with timestamps, the other column contains the offset in hours to UTC of the timestamp in the same row.
The DataFrame looks like this:
In [44]: df
Out[44]:
DATETIME OFFSET
0 2013-01-01 00:00:00+00:00 1
1 2013-01-01 01:00:00+00:00 1
2 2013-01-01 02:00:00+00:00 1
3 2013-01-01 03:00:00+00:00 1
4 2013-01-01 04:00:00+00:00 1
5 2013-01-01 05:00:00+00:00 1
6 2013-01-01 06:00:00+00:00 2
7 2013-01-01 07:00:00+00:00 2
8 2013-01-01 08:00:00+00:00 2
What i like to achieve is to add the offset per row to the timestamp:
In [44]: df
Out[44]:
DATETIME OFFSET
0 2013-01-01 00:00:00+01:00 1
1 2013-01-01 01:00:00+01:00 1
2 2013-01-01 02:00:00+01:00 1
3 2013-01-01 03:00:00+01:00 1
4 2013-01-01 04:00:00+01:00 1
5 2013-01-01 05:00:00+01:00 1
6 2013-01-01 06:00:00+02:00 2
7 2013-01-01 07:00:00+02:00 2
8 2013-01-01 08:00:00+02:00 2
I've tried with to replace tzinfo but failed to find a proper solution. I'm thinking about something like the following (pseudo code):
df.apply(lambda x: x['DATETIME'].replace(tzinfo=pytz.utc + x['OFFSET'])
Any help is appreciated.
Thanks, Thomas
It looks like pytz.FixedOffset
fits this purpose.
In [39]: df.apply(lambda x: pd.Timestamp(x['DATETIME'], tz=pytz.FixedOffset(60*x['OFFSET'])), axis=1)
Out[39]:
0 2013-01-01 00:00:00+01:00
1 2013-01-01 01:00:00+01:00
2 2013-01-01 02:00:00+01:00
3 2013-01-01 03:00:00+01:00
4 2013-01-01 04:00:00+01:00
5 2013-01-01 05:00:00+01:00
6 2013-01-01 06:00:00+02:00
7 2013-01-01 07:00:00+02:00
8 2013-01-01 08:00:00+02:00
dtype: object
Others around here use time series more than I do, so this may not be best practice.