I am attempting to write a query that transforms the following table, where Accrual Length is the number of months between and inclusive of each Activation Date and Accrual End Date.
Institution | Protocol | Activation Date | Accrual End Date | Accrual Length |
---|---|---|---|---|
A | 1 | 10-MAR-19 | 14-MAY-19 | 3 |
A | 2 | 12-MAR-19 | 28-APR-19 | 2 |
B | 3 | 18-MAY-20 | 20-MAY-20 | 1 |
C | 4 | 03-SEP-22 | 25-DEC-22 | 4 |
The goal is to transform the table so that each Institution-Protocol pair has a number of rows which equals Accrual Length and where each row lists each month in the date range, like so:
Institution | Protocol | Month |
---|---|---|
A | 1 | 2019-03 |
A | 1 | 2019-04 |
A | 1 | 2019-05 |
A | 2 | 2019-03 |
A | 2 | 2019-04 |
B | 3 | 2020-05 |
C | 4 | 2022-09 |
C | 4 | 2022-10 |
C | 4 | 2022-11 |
C | 4 | 2022-12 |
In the above example, because Institution A / Protocol 1 was active between March 10, 2019 and May 14, 2019, there are 3 rows where each row contains the month and year starting with the Activation Date and ending at the Accrual End Date (March 2019, April 2019, and May 2019).
My current query uses a connect by
clause (this is my first time ever using connect by
) and the level
pseudocolumn to increment through the months, but because level
does not reset to 1 when the Institution-Protocol changes, it simply keeps incrementing and provides incorrect output.
My query is currently written as follows:
select
protocol_no,
institution_name,
activation_date,
accrual_end_date,
accrual_length,
to_char(add_months(activation_date, level-1), 'yyyy-MM') as accrual_month
from (
select
sv_pcl_institution.protocol_no,
institution_name,
case
when institution_name = 'A' then cc.activation_date
else inst.activation_date
end as activation_date,
case
when (status != 'OPEN TO ACCRUAL' and institution_name = 'A') then cc.accrual_end_date
when (status != 'OPEN TO ACCRUAL' and institution_name != 'A') then inst.accrual_end_date
else null
end as accrual_end_date,
case
when institution_name = 'A' then trunc(months_between(nvl(cc.accrual_end_date, SYSDATE), cc.activation_date))+1
else trunc(months_between(nvl(inst.accrual_end_date, SYSDATE), inst.activation_date))+1
end as accrual_length
from sv_pcl_institution
left join (
select
protocol_no,
min(open_from_date) as activation_date,
max(open_thru_date) as accrual_end_date
from sv_pcl_open_status
group by
protocol_no
) cc
on cc.protocol_no = sv_pcl_institution.protocol_no
and sv_pcl_institution.institution_name = 'A'
left join (
select
protocol_no,
institution,
min(inst_open_from_date) as activation_date,
max(inst_open_thru_date) as accrual_end_date
from sv_pcl_inst_open_status
group by
protocol_no,
institution
) inst
on inst.protocol_no = sv_pcl_institution.protocol_no
and inst.institution = sv_pcl_institution.institution_name
where sv_pcl_institution.institution_name is not null
and (cc.activation_date is not null or inst.activation_date is not null)
)
connect by level <= accrual_length
I've determined that the inner query provides the correct output similar to the first example table, but the result from the entire query contains many, many extraneous (and incorrect) rows. How can I rewrite my query to achieve the desired result?
Here's one option, for sample data you posted:
SQL> with temp (institution, protocol, activation_date, accrual_end_date, accrual_length) as
2 (select 'A', 1, date '2019-03-10', date '2019-05-14', 3 from dual union all
3 select 'A', 2, date '2019-03-12', date '2019-04-28', 2 from dual union all
4 select 'B', 3, date '2020-05-18', date '2020-05-20', 1 from dual union all
5 select 'c', 4, date '2022-09-03', date '2022-12-25', 4 from dual
6 )
Query begins here:
7 select institution, protocol,
8 to_char(add_months(trunc(activation_date), column_value - 1), 'yyyy-mm') mon
9 from temp cross join
10 table(cast(multiset(select level from dual
11 connect by level <= accrual_length
12 ) as sys.odcinumberlist))
13 order by 1, 2, 3;
Result:
INSTITUTION PROTOCOL MON
------------ ---------- -------
A 1 2019-03
A 1 2019-04
A 1 2019-05
A 2 2019-03
A 2 2019-04
B 3 2020-05
c 4 2022-09
c 4 2022-10
c 4 2022-11
c 4 2022-12
10 rows selected.
SQL>