Search code examples
sqlamazon-web-servicesamazon-athenaprestotrino

Bring the last date (result)


I'm trying to bring the mostly recent date with the query below:

select id, amount, yearmonthday
from ( 
select 
id,
amount,
yearmonthday,
row_number() OVER (PARTITION BY id ORDER BY try(date_parse(yearmonthday, '%Y-%m-%d'))) as row_num 
from "db_raw"."table" ) ranked
where row_num = 1 and id = 1

But every time i run the query, brings me a different result not the last date

run for the first time, brings what i want:

id amount yearmonthday
1 10000 20231213

if i run again, brings me other result:

id amount yearmonthday
1 9000 20230327

Solution

  • First of all you need to sort in descending order to get the most resent date (assuming dates in the past):

    row_number() OVER (PARTITION BY id 
         ORDER BY try(date_parse(yearmonthday, '%Y-%m-%d')) desc) as row_num
    

    default sorting order is ascending.

    Also note that:

    • You can have several rows for the same date per partition in data, in this case you will get a random one amongst the "duplicates"
    • try(...) will swallow parsing errors resulting in null so make sure that date it actually parsed otherwise you will not actually order the output. Add try(date_parse(yearmonthday, '%Y-%m-%d') to the select results and see if it is not null (also you can count all unparsable rows with select count(*) from ... where try(date_parse(yearmonthday, '%Y-%m-%d') is null)

    UPD

    The second note was correct - you parse date using incorrect format, yearmonthday does not have delimiters, so remove them:

    row_number() OVER (PARTITION BY id 
         ORDER BY try(date_parse(yearmonthday, '%Y%m%d')) desc) as row_num