SQL lag to row which meets condition

I have a table which contains measures taken on random dates, partitioned by the site at which they were taken.

site	date	measurement
AB1234	2022-12-09	1
AB1234	2022-06-11	2
AB1234	2019-05-22	3
AB1234	2017-01-30	4
CD5678	2022-11-01	5
CD5678	2020-04-10	6
CD5678	2017-04-10	7
CD5678	2017-01-22	8

In order to calculate a year on year growth, I want to have an additional field for each record which contains the previous measurement at that site. The challenging part is that I only want the previous which occurred more than a year in the past.

Like so:

site	date	measurement	previous_measurement
AB1234	2022-12-09	1	3
AB1234	2022-06-11	2	3
AB1234	2019-05-22	3	4
AB1234	2017-01-30	4	NULL
CD5678	2022-11-01	5	6
CD5678	2020-04-10	6	7
CD5678	2017-04-10	7	NULL
CD5678	2017-01-22	8	NULL

It feels like it should be possible with a window function, but I can't work it out.

Please help :(

Solution

Amazon Athena engine version 3 incorporated from Trino. If it has incorporated full support for frame type RANGE for window functions you can use that:

-- sample data
with dataset(site, date, measurement) as (
    values ('AB1234', date '2022-12-09', 1),
    ('AB1234', date '2022-06-11', 2),
    ('AB1234', date '2019-05-22', 3),
    ('AB1234', date '2017-01-30', 4),
    ('CD5678', date '2022-11-01', 5),
    ('CD5678', date '2020-04-10', 6),
    ('CD5678', date '2017-04-10', 7),
    ('CD5678', date '2017-01-22', 8)
)

-- query
select *,
    last_value(measurement) over (
        partition by site
        order by date
        RANGE BETWEEN UNBOUNDED PRECEDING AND interval '1' year PRECEDING)
from dataset;

Output:

site	date	measurement	_col3
CD5678	2017-01-22	8	NULL
CD5678	2017-04-10	7	NULL
CD5678	2020-04-10	6	7
CD5678	2022-11-01	5	6
AB1234	2017-01-30	4	NULL
AB1234	2019-05-22	3	4
AB1234	2022-06-11	2	3
AB1234	2022-12-09	1	3