CODE1 CODE2 CODE3 RATE VALUE MONTH
A B C 1 1 202001
A B C 1 1 202002
A B C 1 1 202003
A B C 2 1 202004
A B C 2 1 202005
A B C 1 1 202006
A B C 1 1 202007
A B C 1 1 202008
A B C 1 1 202009
I am working on migrating data from old system to new system. As part of old system data maintained per month and will update the same row if data updated and table contains one row for a month I am migrating to news system and it contains start date and end date to make the active record . So on update new data needs to inserted and updated the old row end date
My expected data
CODE1 CODE2 CODE3 RATE VALUE START_DT END_DT
A B C 1 1 20200101 20200331
A B C 2 1 20200401 20200531
A B C 1 1 20200601 99991230
If the data active and we will update the date as infinity so 999912
But I am getting only two records and my query is below
CODE1 CODE2 CODE3 RATE VALUE START_DT END_DT
A B C 2 1 20200401 20200531
A B C 1 1 20200601 99991230
SELECT CODE1, CODE2, CODE3 RATE, VALUE,
TO_DATE(MIN(bus_month), 'yyyymm') AS START_DT,
last_day(TO_DATE(replace(MAX(bus_month), $CURRENTMONTG, '999912'), 'yyyymm')) AS end_date
FROM TEST_TABLE
GROUP BY CODE1, CODE2, CODE3, RATE, VALUE
SINCE I am grouping based on CODE1, CODE2, CODE3, RATE, VALUE and getting latest data based on grouping and I am not able to get the old data
Please help me to get the expected table structure. Thanks in advance
Please comment if any more details needed
This is a gaps-and-islands problem, where you want to group together "adjacent" rows that have the same rate and value.
One approach uses the difference between the row numbers to build the groups. Assuming that the three codes define the base group, and that you want to break into a new row whenever the rate or the value changes:
select code1, code2, code3, rate, value, min(month) start_dt,
case when row_number() over(partition by code1, code2, code3 order by max(month) desc) = 1 then 999912 else max(month) end end_dt
from (
select t.*,
row_number() over(partition by code1, code2, code3 order by month) rn1,
row_number() over(partition by code1, code2, code3, rate, value order by month) rn2
from mytable t
) t
group by code1, code2, code3, rate, value, rn1 - rn2
order by start_dt
The conditional expression in the outer query sets the end date of the "last" period to 999912
.
CODE1 | CODE2 | CODE3 | RATE | VALUE | START_DT | END_DT :---- | :---- | :---- | ---: | ----: | -------: | -----: A | B | C | 1 | 1 | 202001 | 202003 A | B | C | 2 | 1 | 202004 | 202005 A | B | C | 1 | 1 | 202006 | 999912