I've been reading about best practices for handling datetimes in Python and for storing them into postgresql (using utc as much as I can, with pytz for conversions, avoiding the tzinfo parameter in datetime constructor, etc...).
But my doubt now is that I'm surprised not finding anything about the datetime.time object and its best practices.
For the sake of example, imagine I want to store only a time, like 20:30, because I'm scheduling some task every week a couple of days on that time, but the day in the week could change every week. And probably the user has entered the time in his/her timezone. In my situation it would be a user in the Spanish timezone 'Europe/Madrid'.
My questions are:
1) Once I get the time as a datetime.time, how do I should store the timezone info into a datetime.time variable. Is it ok to use
datetime.time(h, m, s, ms, tzinfo=pytz_spanish_timezone)
???
2) If not with the former
line of code, how do I localize properly a naive time? datetime.datetime uses
my_datetime.localize(pytz_spanish_timezone)
3) How do I convert one datetime.time object from a timezone to another, given that with datetime and pytz it would be using
new_tz_datetime = my_datetime.astimezone(pytz_spanish_timezone)
but with time there is no similar method
4) How should I store the datetime.time in postgresql database? I know there are time and timetz data types. I suppose I should store the time as UTC. Would the timezone matter? Should I store it somehow?
2) [H]ow do I localize properly a naive time?
datetime.datetime
usesmy_datetime.localize(pytz_spanish_timezone)
Actually, it is the other way around. localize
is a pytz timezone method, not a datetime
method:
import pytz
madrid = pytz.timezone('Europe/Madrid')
aware_datetime = madrid.localize(naive_datetime)
You need a datetime.datetime
here. There is no equivalent for datetime.time
objects. See below for the reason why.
3) How do I convert one
datetime.time
object from a timezone to another?
Consider the following situation: We know the time is 20:30 and the timezone is Europe/Madrid
and we wish to convert it to UTC.
The result will be different depending on whether or not the date falls within Daylight Savings Time (CEST) or not (CET):
For example,
import datetime as DT
import pytz
madrid = pytz.timezone('Europe/Madrid')
utc = pytz.utc
CET_date = madrid.localize(DT.datetime(2019, 3, 30, 20, 30, 0), is_dst=None)
# the most recent transition occurred at `2019-03-31 02:00:00+01:00 CEST`
CEST_date = madrid.localize(DT.datetime(2019, 3, 31, 20, 30, 0), is_dst=None)
print(CET_date.astimezone(utc))
print(CEST_date.astimezone(utc))
# 2019-03-30 19:30:00+00:00
# 2019-03-31 18:30:00+00:00
Notice that when the date is in CET, the time 20:30 is "converted" to 19:30, but when the date is in CEST, the time is converted to 18:30. There is no (simple) answer to your question without first knowing the date.
4a) How should I store the
datetime.time
in postgresql database? I know there are time and timetz data types.
Per the docs:
The type
time with time zone
is defined by the SQL standard, but the definition exhibits properties which lead to questionable usefulness.
I think the docs are alluding to the issue shown above. Don't use time with
time zone
. If you want to store a time, use the PostgreSQL plain time
type.
You could store the time
and the timezone
in the database, and reconstitute
a timezone-aware datetime later once you have a date. But note that there are
pitfalls:
There are ambiguous local datetimes
import datetime as DT
import pytz
madrid = pytz.timezone('Europe/Madrid')
date = madrid.localize(DT.datetime(2019, 10, 27, 2, 0, 0), is_dst=None)
raises pytz.exceptions.AmbiguousTimeError: 2019-10-27 02:00:00
.
To avoid the AmbiguousTimeError
, is_dst
must be specified explicitly:
import datetime as DT
import pytz
madrid = pytz.timezone('Europe/Madrid')
date = madrid.localize(DT.datetime(2019, 10, 27, 2, 0, 0), is_dst=False)
print(date)
date = madrid.localize(DT.datetime(2019, 10, 27, 2, 0, 0), is_dst=True)
print(date)
# 2019-10-27 02:00:00+01:00
# 2019-10-27 02:00:00+02:00
There are nonexistent local datetimes
import datetime as DT
import pytz
madrid = pytz.timezone('Europe/Madrid')
madrid.localize(DT.datetime(2019, 3, 31, 2, 0, 0), is_dst=None)
raises pytz.exceptions.NonExistentTimeError: 2019-03-31 02:00:00
You can avoid the NonExistentTimeError by specifying whether or not the naive localtime refers to a time during DST (Daylight Savings Time):
import datetime as DT
import pytz
madrid = pytz.timezone('Europe/Madrid')
date = madrid.normalize(madrid.localize(DT.datetime(2019, 3, 31, 2, 0, 0), is_dst=False))
print(date)
date = madrid.normalize(madrid.localize(DT.datetime(2019, 3, 31, 2, 0, 0), is_dst=True))
print(date)
# 2019-03-31 03:00:00+02:00
# 2019-03-31 01:00:00+01:00
There can be datetimes which are unrepresentable given a local datetime and a particular timezone.
The AmbiguousTimeError
and NonExistentTimeError
above show the importance of specifying the value of is_dst
.
To avoid these errors you would need to store a boolean is_dst
along with time
and timezone
in the database.
You might think that you could avoid the problem simply choosing one value of
is_dst
for all times. But you'd be mistaken. Here is a peculiar example
(taken from the pytz docs) which shows if you
always choose is_dst = False
(or is_dst = True
) there can be UTC datetimes
which can not be expressed given just a naive localtime and a timezone!
import datetime as DT
import pytz
warsaw = pytz.timezone('Europe/Warsaw')
utc = pytz.utc
date1 = warsaw.localize(DT.datetime(1915, 8, 4, 23, 35, 59), is_dst=False).astimezone(utc)
date2 = warsaw.localize(DT.datetime(1915, 8, 4, 23, 36, 0), is_dst=False).astimezone(utc)
print('Datetimes between {} and {} can not be expressed if we assume is_dist=False.'.format(date1, date2))
date3 = warsaw.localize(DT.datetime(1915, 8, 4, 23, 59, 59), is_dst=True).astimezone(utc)
date4 = warsaw.localize(DT.datetime(1915, 8, 5, 0, 0, 0), is_dst=True).astimezone(utc)
print('Datetimes between {} and {} can not be expressed if we assume is_dist=True.'.format(date1, date2))
prints
Datetimes between 1915-08-04 22:11:59+00:00 and 1915-08-04 22:36:00+00:00 can not be expressed if we assume is_dist=False.
Datetimes between 1915-08-04 22:11:59+00:00 and 1915-08-04 22:36:00+00:00 can not be expressed if we assume is_dist=True.
4b) I suppose I should store the time as UTC. Would the timezone matter? Should I store it somehow?
For the reasons shown above there is no such thing as a time (without a date) in UTC. But you can avoid the problems mentioned above by simplying storing datetimes in UTC.
If you create a table with a column with timestamptz
data type, then
you can use a database adapter such as psycopg2
to store Python timezone-aware datetimes
as PostgreSQL timestamptz
s. When you query the database, psycopg2
will convert the timestamptz
s back into
timezone-aware datetimes for you.
Internally, PostgreSQL stores all timestamptz
s in UTC, but it reports the values with respect to
the PostgreSQL user's timezone setting. On the Python side, given a timezone-aware datetime,
you can use its astimezone
method to convert it to any timezone you like.
You don't need to store the timezone separately unless you want to report different datetimes with respect to different timezones.
5) How to parse a time from a string without going through datetime?
You could use regex to parse time strings:
import re
import datetime as DT
atime = DT.time(*map(int, re.search(r'(\d{,2}):(\d{,2}):(\d{,2})', 'blueberry jam at 13:32:02').groups()))
print(repr(atime))
# datetime.time(13, 32, 2)
Above, the regex pattern \d
matches a single digit. \d{1,2}
matches 1 or 2 digits.
Alternatively, the 3rd-party dateutil package can parse time strings in a variety of formats:
import dateutil.parser as DP
print(DP.parse("13:32:02").time())
# 13:32:02
print(DP.parse("blueberry jam at 13:32:02", fuzzy=True).time())
# 13:32:02
print(DP.parse("30 minutes 12 hours").time())
# 12:30:00
print(DP.parse("2:30pm").time())
# 14:30:00
There is a lot to digest here, and there is probably more that could be said about each of these issues. In the future, you may want to split your post into multiple questions. This will lower the barrier for people who may wish to answer one question but not all of them, and will help you get more answers quicker.