Search code examples
sqlamazon-athenaprestotrino

How to create a lag year variable in Amazon Athena Environment using SQL


How can I generate a table from wide to long with some specific conditions?

This is my original data, each id has three columns, year is their index year and y-1 is the year prior to the index year, y-2 is 2 years prior to the index year, each id only has 1 row

Then I wish to reshape the table, from wide to long, make only a variable named year, a index year, with previous year and 2 years prior to the index year but with one more condition. I would like to expand the year moving forward to year 2021 for each bene_id. Something like below.

This is what I am looking for..:

This is what I am looking for..

Can anyone give me some advice?

I use below codes but still got wrong table.. note: I generate the codes in Amazon Athena

  WITH RECURSIVE YearSequence(bene_id, Year) AS (
    SELECT bene_id, year
    FROM tablea 
    UNION ALL
    SELECT ts.bene_id, YearSequence.Year - 1
    FROM YearSequence
    JOIN tablea  AS ts ON YearSequence.bene_id = ts.bene_id
    WHERE YearSequence.Year > 2016 
  )
  SELECT bene_id, Year
  FROM YearSequence
  WHERE Year <= 2021
  ORDER BY bene_id, Year
);

Solution

  • No need to use recursive here, sequence + unnest should do the trick. Assuming that y (y-2 in you data) is always the "start" year something like the following:

    -- sample data, a bit simplified
    with dataset(id, year, y) as(
        values ('A', 2018, 2017),
               ('B', 2019, 2018)
    )
    
    -- query
    select id, t.year
    from dataset,
         unnest(sequence(y, 2021)) as t(year); -- or sequence(year-2, 2021) for your case
    

    Output:

    id year
    A 2017
    A 2018
    A 2019
    A 2020
    A 2021
    B 2018
    B 2019
    B 2020
    B 2021