start_date | end_date |
---|---|
2018-02-21 | 2018-03-21 |
2018-03-21 | 2018-04-21 |
2018-04-21 | 2018-05-21 |
2018-05-21 | 2018-06-21 |
and I would like to turn it into:
start_date | end_date |
---|---|
2018-02-21 | 2018-02-28 |
2018-03-01 | 2018-03-31 |
2018-04-01 | 2018-04-30 |
2018-05-01 | 2018-05-31 |
2018-06-01 | 2018-06-21 |
You may use this approach:
match_recognize
for it.sequence
function in the correlated subquery.Below is a query:
select
grouped.id, /*Common key for group of rows*/
grouped.grp, /*Sequence number for collapsed interval per ID*/
cast(greatest(grouped.frame_from, mon.dt) as date) as date_start,
least(grouped.frame_to, last_day_of_month(mon.dt)) as date_end
from t
match_recognize (
partition by id /*Restart grouping at each ID*/
order by start_date asc
measures
match_number() as grp,
/*start_date of collapsed interval is the start_date of the first match*/
first(start_date) as frame_from,
/*end_date - end_date of the last match*/
last(end_date) as frame_to
/*Match group is any row followed by any number of consequtive intervals*/
pattern (any_ conseq*)
define
/*Add row to a group if start_date is equal to end_date of the previous row*/
conseq as start_date = prev(end_date)
) as grouped
/*Generate rows for each grouped interval*/
cross join unnest(sequence(
cast(date_trunc('month', frame_from) as timestamp),
cast(frame_to as timestamp),
interval '1' month
)) as mon(dt)
order by 1,2 asc,3 asc
which for this sample data
with t(id, start_date, end_date) as (
values
(1, date '2018-02-21', date '2018-03-21'),
(1, date '2018-03-21', date '2018-04-21'),
(1, date '2018-04-21', date '2018-05-21'),
(1, date '2018-05-21', date '2018-06-21'),
(1, date '2018-06-25', date '2018-08-23'),
(2, date '2022-01-15', date '2022-02-15'),
(2, date '2022-02-15', date '2022-03-15')
)
returns this result in AWS Athena 3 (which is based on Trino):
# | id | grp | date_start | date_end |
---|---|---|---|---|
1 | 1 | 1 | 2018-02-21 | 2018-02-28 |
2 | 1 | 1 | 2018-03-01 | 2018-03-31 |
3 | 1 | 1 | 2018-04-01 | 2018-04-30 |
4 | 1 | 1 | 2018-05-01 | 2018-05-31 |
5 | 1 | 1 | 2018-06-01 | 2018-06-21 |
6 | 1 | 2 | 2018-06-25 | 2018-06-30 |
7 | 1 | 2 | 2018-07-01 | 2018-07-31 |
8 | 1 | 2 | 2018-08-01 | 2018-08-23 |
9 | 2 | 1 | 2022-01-15 | 2022-01-31 |
10 | 2 | 1 | 2022-02-01 | 2022-02-28 |
11 | 2 | 1 | 2022-03-01 | 2022-03-15 |