I'm encountering an issue with reading timestamps properly, without any automatic conversions on Presto on EMR.
Example: within the AWS Glue catalog, i have a table with timestamp columns in UTC time (data type timestamp). When querying in Athena, they return as expected. When querying in Presto on EMR (EMR 5.26, Presto 0.220), there is an automatic conversion happening to a different time zone.
Presto docs describe a method of disabling this behavior here - https://prestosql.io/docs/current/language/timestamp.
The legacy semantics can be enabled using the deprecated.legacy-timestamp config property. Setting it to true (the default) enables the legacy semantics, whereas setting it to false enables the new semantics.
They outline their result differences with this option set to true vs false at the bottom
Query: SELECT TIME '10:00:00 Asia/Kathmandu' AT TIME ZONE 'UTC'
Legacy result: 04:30:00.000 UTC
New result: 04:15:00.000 UTC
After including deprecated.legacy-timestamp
set to true
in my EMR config (within presto-config), I'm still getting the new result according to this test query, (and my UTC timestamps are still being auto converted).
Any suggestions on what else i need to do to enable the legacy timestamp behavior?
Legacy timestamp behavior is still the default, you can track current state at https://github.com/prestosql/presto/issues/37. Apparently Athena evaluates timestamps as Presto would do when run with UTC session zone.
Since Presto 317 you can force client session zone with a config property:
sql.forced-session-time-zone=UTC
For all Presto versions, you can set client session zone. How to do this depends on the particular client in use. For example, with presto-cli
you would typically do
java -Duser.timezone=UTC -jar presto-cli.jar