I am working with a dataset that has a column titled "birth_dt" from which I am trying to calculate the age of the member as of 18 months ago from.
This is what "select distinct(birth_dt) from table.DB limit 9;" shows the column to look like:
birth_dt
-----------------------
1990-08-15 00:00:00.000
1986-05-30 00:00:00.000
1981-03-03 00:00:00.000
1970-05-11 00:00:00.000
1963-01-20 00:00:00.000
1985-01-26 00:00:00.000
1983-01-03 00:00:00.000
2021-01-03 00:00:00.000
1995-11-14 00:00:00.000
In Athena, this column shows as being a "string".
So far, what I have tried is:
select DATE_DIFF('year', birth_dt, (current_timestamp - interval '18' month)) as age from table.DB limit 10;
which kept giving me an error saying:
"SYNTAX_ERROR: line 1:8: Unexpected parameters (varchar(4), varchar, timestamp with time zone) for function date_diff. Expected: date_diff(varchar(x), time, time) , date_diff(varchar(x), time with time zone, time with time zone) , date_diff(varchar(x), timestamp with time zone, timestamp with time zone) , date_diff(varchar(x), date, date) , date_diff(varchar(x), timestamp, timestamp)"
So then I tried to just focus on getting that "birth_dt" column to change to a valid date to no avail with the following:
select date_parse(birth_dt, '%YYYY-%MM-%dd') as age from table.DB limit 10;
select cast(from_iso8601_timestamp("birth_dt.1") AS date) as age from table.DB limit 10;
SELECT date_parse(birth_dt,'%Y-%m-%dT%H:%i:%s%+00:00') as date_column,
current_timestamp as Todays_Date,
date_diff('day',current_timestamp,date_parse(birth_dt,'%Y-%m-%dT%H:%i:%s%+00:00')) as difference
FROM table.DB limit 10;
These queries end up giving me errors that say either:
"INVALID_FUNCTION_ARGUMENT: Invalid format: "1990-08-15 00:00:00.000" is malformed at "-08-15 00:00:00.000""
or
"INVALID_FUNCTION_ARGUMENT: Invalid format: "1990-08-15 00:00:00.000" is malformed at "00:00:00.000""
What I want to eventually have for my examples above is:
Age
---
31
35
40
51
58
36
38
1
25
Please help! Thank you!!!
Providing valid format is crucial here. Some attempts where really close, try the following:
-- sample data
WITH dataset(birth_dt) AS (
values ('1983-01-03 00:00:00.000'),
('2021-01-03 00:00:00.000')
)
-- query
select date_diff(
'year',
date_parse(birth_dt, '%Y-%m-%d %T.%f'),
current_timestamp - interval '18' month)
from dataset;
Output:
_col0 |
---|
41 |
3 |
Check the MySQL Date Functions docs for all supported format specifiers and what they mean.