I'm working on use case where i need to convert the string type to integer type in AWS Athena
Below is my input data -
year month
2022 jan
2022 feb
2012 apr
2023 may
2019 jun
Below is what I'm expecting -
year month yearmon
2022 jan 202201
2022 feb 202202
2012 apr 201204
2023 may 202305
2019 dec 201912
Initially I thought to use case statement wherein I hardcode for each month, concatenate this with year column and finally parse the yearmon as int
. Something like -
concat(year, case when month = 'jan' then '01'
...
... end)
Finally cast it to int
I tried below formats -
Select month("jan")
Select cast("jan" as date)
But nothing seem to have worked.
Is there a better way to cast this use case than hard coding the values in case statement?
You can try using date_parse
and date_format
. The following works in Trino on which the Athena is based on:
-- sample data
with dataset(year, month) as (
values (2022, 'jan'),
(2022, 'feb'),
(2012, 'apr'),
(2023, 'may'),
(2019, 'jun')
)
-- query
select date_format(
date_parse(cast(year as varchar) || '-' || month, '%Y-%b'),
'%Y%m'
)
from dataset;
Which produces the following output (TBH I was a bit surprised that it can handle lowercase month names):
_col0 |
---|
202201 |
202202 |
201204 |
202305 |
201906 |
If this will not work for your actual data then using case-when is the approach you might want to take.