I want to calculate a DST-aware figure for the total hours in a period, localized for Europe/London.
Given the start time, I need to generate the end time and then calculate the hours in the period.
For example:
MONTHLY PERIOD
# define start
s = pd.to_datetime('2023-03-01').tz_localize('Europe/London')
>>> Timestamp('2023-03-01 00:00:00+0000', tz='Europe/London')
# generate end
e = (s + pd.offsets.MonthEnd()) + pd.Timedelta(days=1)
>>> Timestamp('2023-04-01 00:00:00+0100', tz='Europe/London')
# calculate hrs in period
(e - s) / pd.Timedelta(hours = 1)
>>> 743.0
This looks right: in the UK an hour is lost in March when the clocks go forward, so the month has 743 hours rather than 744.
However, when I change the year to 2024 by setting s = pd.to_datetime('2024-03-01').tz_localize('Europe/London'), an incorrect result of 744 is returned.
I'd like a foolproof way of calculating the hours in a period, please.
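Using pd.offsets.MonthEnd() sets the end date to 2024-03-31 00:00:00 GMT, which is before that day's DST change, and then pd.Timedelta(days=1) adds exactly 24 hours. In 2024 there are really only 23 hours from that point to the beginning of the next month, because the clocks go forward at 01:00 on 2024-03-31, so the result lands on 01:00 on 1 April and the total is one hour too high. The calculation is correct in 2023 because the DST change had already occurred on 2023-03-26, so the MonthEnd() result is already in summer time and the following day is a normal 24-hour day. Using pd.offsets.MonthBegin() instead does it correctly for both years.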
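The difference between absolute and calendar arithmetic can be seen in isolation on the day of the change. This is a minimal sketch, not part of the original code: the variable names are illustrative, and it uses pd.DateOffset(days=1) as the calendar-aware counterpart to pd.Timedelta(days=1).

```python
import pandas as pd

# Local midnight on the day the UK clocks go forward (2024-03-31)
t = pd.Timestamp('2024-03-31', tz='Europe/London')

# Timedelta is absolute elapsed time: exactly 24 hours later
absolute = t + pd.Timedelta(days=1)

# DateOffset(days=1) is calendar arithmetic: same wall-clock time on the next day
calendar = t + pd.DateOffset(days=1)

print(absolute)  # 2024-04-01 01:00:00+01:00
print(calendar)  # 2024-04-01 00:00:00+01:00
print((calendar - t) / pd.Timedelta(hours=1))  # 23.0
```

The Timedelta result overshoots local midnight by an hour, which is where the extra hour in the 744 figure comes from.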
import pandas as pd
# Approach 1: MonthEnd plus a fixed 24-hour Timedelta (off by an hour in March 2024)
# define start
s = pd.to_datetime('2024-03-01').tz_localize('Europe/London')
print(s)
e = (s + pd.offsets.MonthEnd()) + pd.Timedelta(days=1)
print(e, 'adds 24 hours, note 1am due to DST change')
# calculate hrs in period
hours = (e - s) / pd.Timedelta(hours=1)
print(hours)
# Approach 2: MonthBegin moves to local midnight on the 1st of the next month
# define start
s = pd.to_datetime('2024-03-01').tz_localize('Europe/London')
print(s)
e = s + pd.offsets.MonthBegin()
print(e, 'correctly shifts to DST')
# calculate hrs in period
hours = (e - s) / pd.Timedelta(hours=1)
print(hours)
Output:
2024-03-01 00:00:00+00:00
2024-04-01 01:00:00+01:00 adds 24 hours, note 1am due to DST change
744.0
2024-03-01 00:00:00+00:00
2024-04-01 00:00:00+01:00 correctly shifts to DST
743.0
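For periods other than a calendar month, one possible generalisation is to do the date arithmetic on naive (wall-clock) timestamps and only then localize both ends, so the end always lands on the intended local time. This is a sketch under assumptions: hours_in_period is a hypothetical helper, not a pandas function, and it assumes the start is passed in naive and that neither end falls on a nonexistent or ambiguous local time (true for midnight in Europe/London).

```python
import pandas as pd

def hours_in_period(start, offset=pd.DateOffset(months=1), tz='Europe/London'):
    """Return the DST-aware number of hours from `start` to `start + offset`.

    Hypothetical helper: `start` is a naive date or datetime (string or Timestamp).
    """
    start_naive = pd.Timestamp(start)
    end_naive = start_naive + offset      # calendar arithmetic on wall-clock time
    s = start_naive.tz_localize(tz)       # attach the time zone afterwards
    e = end_naive.tz_localize(tz)
    return (e - s) / pd.Timedelta(hours=1)

print(hours_in_period('2023-03-01'))  # 743.0 (clocks go forward)
print(hours_in_period('2024-03-01'))  # 743.0 (clocks go forward)
print(hours_in_period('2024-10-01'))  # 745.0 (clocks go back)
print(hours_in_period('2024-02-01'))  # 696.0 (29 days, no DST change)
```

Passing a different offset, for example pd.DateOffset(days=7), gives the hours in a week starting at the given local time.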