I'm trying to write a script which takes in a date and returns a pandas dataframe containing zeroes with localized datetimes in half hour periods for the index. SP refers to settlement period which is half an hour. The scripts works really well on most days but when it changes to or from daylight savings time I get:
AmbiguousTimeError: Cannot infer dst time from '2017-10-29 01:00:00', try using the 'ambiguous' argument
This happens at the line:
df_datetime.at[datetime, "Generation"] = 0
The date it tries to convert is '2017-10-29 01:00:00' even though it is given '2017-10-29 01:00:00 + 1:00:00', would I have to convert to UTC=0, convert in pandas then localize again? the 29th October 2017 is the day that daylight savings time finished in the UK.
The full script is here below:
import pandas as pd
from datetime import datetime, timedelta
import pytz
def SP_to_time_delta(SP):
dec_hour = (SP - 1)/2
hour = int(dec_hour)
if abs(int(dec_hour)-dec_hour) == 0.5:
minute = 30
else:
minute = 0
SP_timedelta = timedelta(hours=hour, minutes=minute)
return SP_timedelta
def localize_datetime_UK(date, SP):
pytz_tz = pytz.timezone('Europe/London')
local_date = pytz_tz.localize(date)
SP_timedelta = SP_to_time_delta(SP)
local_datetime = local_date+SP_timedelta
return local_datetime
def get_datetime_df(SettlementDate, max_SP):
df_datetime = pd.DataFrame([])
for i in range(max_SP+1):
datetime = localize_datetime_UK(SettlementDate, i)
df_datetime.at[datetime, "Generation"] = 0
df_datetime = df_datetime.sort_index()
return df_datetime
SettlementDate = datetime(2017, 10, 29)
df_datetime = get_datetime_df(SettlementDate, 50)
What's the best way to solve this?
Thanks for all replies and help!
Your problem originates due to a mixup in daylight saving time. Consider the following:
import pandas as pd
from datetime import datetime, timedelta
import pytz
pytz_tz = pytz.timezone('Europe/London')
datetime_1 = pytz_tz.localize(datetime(2017, 10, 29) \
+ timedelta(hours=0, minutes=0))
datetime_2 = pytz_tz.localize(datetime(2017, 10, 29) \
+ timedelta(hours=1, minutes=0))
print(datetime_1)
print(datetime_2)
> 2017-10-29 00:00:00+01:00
> 2017-10-29 01:00:00+00:00
As you can see, you're handling two times the "same" point in time but with different encodings.
One solution could be to convert all times to UTC (which has no dst) and only convert the times back when needed for output.
However, I think that you are adding the timedelta at the wrong place. Consider
datetime_1 = pytz_tz.localize(datetime(2017, 10, 29)) \ # localize first
+ timedelta(hours=0) # then add the delta
datetime_2 = pytz_tz.localize(datetime(2017, 10, 29)) \ # localize first
+ timedelta(hours=1) # then add the delta
> 2017-10-29 00:00:00+01:00
> 2017-10-29 01:00:00+01:00
which yields unique results. Check the semantics of your application which version applies.
===== old answer ======
You have to make the datetime object timezone aware using
from datetime import datetime, timedelta
from pytz import timezone
import pytz
eastern = timezone('US/Eastern')
SelltementDate = eastern.localize(datetime(2017, 10, 29, 0, 0, 0))
Refer also to the documentation of pytz.