Search code examples
sqloracle-databaseoracle12c

Oracle - I want to find out delta changes in a database table


The table is loaded every night with some data (around 100K rows). Then my batch job reads each record from the table and processes it. My requirement is to only process those records who have changes into specific columns because there is no point in processing a new record if it does not have any changes compared to yesterday's record (and it unnecessary delays whole processing of the job).

Please assume below table structure

Table Name: STAGING_T

Column Names: PK_1 | COL1 | COL2 | COL3 | COL4 | CREATE_DATE | FLAG

CREATE_DATE is the sysdate when the data is loaded into this table. This is used to identify day wise data loaded into this table.

  1. If a row is present in today's data but not in yesterday's(previous day's), update FLAG to 'INSERT'
  2. If a row is present in yesterday's data but not in today's, update the FLAG to 'DELETE'
  3. If a row is present in both today's and yesterday's data and there is any change in COL3 and COL4 of both the records, update the FLAG to 'UPDATE'.
  4. If a row is present in both today's and yesterday's data and there is NO change in COL3 and COL4 of both the records, update the FLAG to 'NO CHANGE'.

Common columns based on which I identify the same records of yesterday and today are COL1 and COL2.

**Sample Data**
PK_1    COL1    COL2    COL3    COL4    CREATE_DATE FLAG
1       1000    2000    a       x       31.01.2019  
2       1000    2001    b       y       31.01.2019  
3       1000    2002    c       z       31.01.2019  
4       1000    2000    aa      x       30.01.2019  
5       1000    2001    b       y       30.01.2019  
6       1000    2003    d       z       30.01.2019  


**Expected Output**

PK_1    COL1    COL2    COL3    COL4    CREATE_DATE FLAG
1       1000    2000    a       x       31.01.2019  UPDATE
2       1000    2001    b       y       31.01.2019  NO CHANGE
3       1000    2002    c       z       31.01.2019  INSERT
4       1000    2000    aa      x       30.01.2019  
5       1000    2001    b       y       30.01.2019  
6       1000    2003    d       z       30.01.2019  

Any help or suggestions would be great. Thank you.


Solution

  • You can first generate the desired values and then use merge statement. You may try below -

    MERGE INTO DATA T 
    USING (SELECT D.PK_1, D.COL1, D.COL2, D.COL3, CREATE_DATE, D.COL4,
           CASE WHEN CREATE_DATE = (SELECT MAX(CREATE_DATE) FROM DATA) THEN
                CASE WHEN LAG(COL3 || '`'||COL4) OVER(PARTITION BY COL1,COL2 ORDER BY CREATE_DATE) IS NULL
                          THEN 'INSERT'
                     WHEN LAG(COL3 || '`'||COL4) OVER(PARTITION BY COL1,COL2 ORDER BY CREATE_DATE)  = COL3 || '`'||COL4
                          THEN 'NO CHANGE'
                     ELSE 'UPDATE'
                END
           END FLAG
           FROM DATA D)
    S ON (T.PK_1 = S.PK_1)
    WHEN MATCHED THEN UPDATE SET T.FLAG = S.FLAG;
    

    Here is the demo.