Search code examples
sqloracle-databasedatetimewindow-functionsgaps-and-islands

Oracle SQL data migration row to column based in month


CODE1   CODE2   CODE3   RATE    VALUE   MONTH
A       B       C       1       1       202001
A       B       C       1       1       202002
A       B       C       1       1       202003
A       B       C       2       1       202004
A       B       C       2       1       202005
A       B       C       1       1       202006
A       B       C       1       1       202007
A       B       C       1       1       202008
A       B       C       1       1       202009

I am working on migrating data from old system to new system. As part of old system data maintained per month and will update the same row if data updated and table contains one row for a month I am migrating to news system and it contains start date and end date to make the active record . So on update new data needs to inserted and updated the old row end date

My expected data

CODE1   CODE2   CODE3   RATE    VALUE   START_DT    END_DT
A       B       C       1       1       20200101    20200331
A       B       C       2       1       20200401    20200531
A       B       C       1       1       20200601    99991230

If the data active and we will update the date as infinity so 999912

But I am getting only two records and my query is below

CODE1   CODE2   CODE3   RATE    VALUE   START_DT    END_DT
A       B       C       2       1       20200401    20200531
A       B       C       1       1       20200601    99991230


SELECT CODE1, CODE2, CODE3 RATE, VALUE,
 TO_DATE(MIN(bus_month), 'yyyymm') AS START_DT,
 last_day(TO_DATE(replace(MAX(bus_month), $CURRENTMONTG, '999912'), 'yyyymm')) AS end_date
FROM TEST_TABLE
GROUP BY CODE1, CODE2, CODE3, RATE, VALUE

SINCE I am grouping based on CODE1, CODE2, CODE3, RATE, VALUE and getting latest data based on grouping and I am not able to get the old data

Please help me to get the expected table structure. Thanks in advance

Please comment if any more details needed


Solution

  • This is a gaps-and-islands problem, where you want to group together "adjacent" rows that have the same rate and value.

    One approach uses the difference between the row numbers to build the groups. Assuming that the three codes define the base group, and that you want to break into a new row whenever the rate or the value changes:

    select code1, code2, code3, rate, value, min(month) start_dt, 
        case when row_number() over(partition by code1, code2, code3 order by max(month) desc) = 1 then 999912 else max(month) end end_dt
    from (
        select t.*,
            row_number() over(partition by code1, code2, code3 order by month) rn1,
            row_number() over(partition by code1, code2, code3, rate, value order by month) rn2
        from mytable t
    ) t
    group by code1, code2, code3, rate, value, rn1 - rn2
    order by start_dt
    

    The conditional expression in the outer query sets the end date of the "last" period to 999912.

    Demo on DB Fiddle:

    CODE1 | CODE2 | CODE3 | RATE | VALUE | START_DT | END_DT
    :---- | :---- | :---- | ---: | ----: | -------: | -----:
    A     | B     | C     |    1 |     1 |   202001 | 202003
    A     | B     | C     |    2 |     1 |   202004 | 202005
    A     | B     | C     |    1 |     1 |   202006 | 999912