We have one client table in Athena where creatied_date and updated_date are in the following format
created_date : 2023-03-19T00:19:47+00:00
updated_date : 2023-03-19T00:19:49+00:00
So for one particular order, I have 3 rows, with creation_date value same but updated_date value different.
If I want to pick the latest entry, I need to compare updated_date values.
I plan to convert them into epoch_utc and epoch_utc_tz (time-zone) format.
I am seeking an Athena SQL solution to convert updated_date into epoch_utc and epoch_utc_tz formats.
order_id created_date updated_date
123 2023-03-19T00:19:47+00:00 2023-03-19T00:19:49+00:00
123 2023-03-19T00:19:47+00:00 2023-03-20T01:34:21+00:00
123 2023-03-19T00:19:47+00:00 2023-03-19T01:09:39+00:00
If your dates all have this particular format and have the same offset you don't need to convert - just compare, for example:
-- sample data
WITH dataset(order_id, created_date, updated_date) AS (
values (123, '2023-03-19T00:19:47+00:00', '2023-03-19T00:19:49+00:00'),
(123, '2023-03-19T00:19:47+00:00', '2023-03-20T01:34:21+00:00'),
(123, '2023-03-19T00:19:47+00:00', '2023-03-19T01:09:39+00:00')
)
-- query
select order_id, created_date, max(updated_date) updated_date
from dataset
group by order_id, created_date;
Output:
order_id | created_date | updated_date
----------+---------------------------+---------------------------
123 | 2023-03-19T00:19:47+00:00 | 2023-03-20T01:34:21+00:00
If you need the date itself - just convert to date with from_iso8601_timestamp
(since the sample dates are in the corresponding format) and compare dates:
select order_id, created_date, max(from_iso8601_timestamp(updated_date))
from dataset
group by order_id, created_date;