Search code examples
sql-servert-sqlmergescd2

SQL Merge to Update table with changed history


I have been tasked with building a history table in SQL. I have already built the base table which contains multiple left joins amongst other things. The base table will need to be compared to another table and only update specific columns that have changed, insert new rows where the key doesn't match.

Previously I have used other ETL tools which have GUI style built in SCD loaders, but I don't have such luxury in SQL Server. Here the merge statement can handle such operations. I have used the MERGE statement before, but I become a bit stuck when handling flags and date fields based on the operation performed.

Here is the BASE table

KEY CLIENT QUANTITY CONTRACT_NO FC_COUNT DELETE_FLAG RECORD_UPDATED_DATE
345 A 1000 5015 1 N 31/12/9999
346 B 2000 9352 1 N 31/12/9999
347 C 3000 6903 1 N 31/12/9999
348 D 1000 7085 1 N 31/12/9999
349 E 1000 8488 1 N 31/12/9999
350 F 500 6254 1 N 31/12/9999

Here is the table I plan to merge with

KEY CLIENT QUANTITY CONTRACT_NO FC_COUNT
345 A 1299 5015 1
346 B 2011 9352 1
351 Z 5987 5541 1

The results I'm looking for are

KEY CLIENT QUANTITY CONTRACT_NO FC_COUNT DELETE_FLAG RECORD_UPDATED_DATE
345 A 1000 5015 1 N 06/07/2022
345 A 1299 5015 1 N 31/12/9999
346 B 2000 9352 1 N 06/07/2022
346 B 2011 9352 1 N 31/12/9999
347 C 3000 6903 1 Y 06/07/2022
348 D 1000 7085 1 Y 06/07/2022
349 E 1000 8488 1 Y 06/07/2022
350 F 500 6254 1 Y 06/07/2022
351 Z 5987 5541 1 N 31/12/9999

As we can see I have shown the changes, closed off the old records, marked with a date and a delete flag if they are missing but was there previous, as well as new new row with the new key and data

Would this be a MERGE? Some direction on how to perform this sort of operation would be a great help. We have a lot of tables where we need to keep change history and this will help a lot going forward.

code shell attempt

SELECT      
            MAIN_KEY,
            CLIENT,
            QUANTITY,
            CONTRACT_NO,
            1 AS FC_COUNT,
            NULL as DELETE_FLG_DD,
            GETDATE() as RECORD_UPDATED_DATE
INTO  #G1_DELTA
FROM
            [dwh].STG_DTL
MERGE [dwh].[PRJ1_DELTA] TARGET
USING #G1_DELTA SOURCE
ON   TARGET.MAIN_KEY                    = SOURCE.MAIN_KEY
    
WHEN MATCHED THEN INSERT 
    (
        MAIN_KEY,
        CLIENT,
        QUANTITY,
        CONTRACT_NO,
        FC_COUNT,
        DELETE_FLG_DD,
        RECORD_UPDATED_DATE
    )
VALUES
    (
        SOURCE.MAIN_KEY,
        SOURCE.CLIENT,
        SOURCE.QUANTITY,
        SOURCE.CONTRACT_NO,
        SOURCE.FC_COUNT,
        SOURCE.DELETE_FLG_DD,
        SOURCE.RECORD_UPDATED_DATE
        )

Solution

  • If you need to build a history table containing the updated information from your two tables, you first need to select updated information from your two tables.

    The changes that need to be applied to your tables are on:

    • "tab1.[DELETE_FLAG]", that should be updated to 'Y' whenever it has a match with tab2
    • "tab1.[RECORD_UPDATED_DATE]", that should be updated to the current date
    • "tab2.[DELETE_FLAG]", missing and that should be initialized to N
    • "tab2.[RECORD_UPDATED_DATE]", missing and that should be initialized to your random date 9999-12-31.

    Once these changes are made, you can apply the UNION ALL to get the rows from your two tables together.

    Then, in order to generate a table, you can use a cte to select the output result set and use the INTO <table> clause after a selection to generate your "history" table.

    WITH cte AS (
        SELECT tab1.[KEY],
               tab1.[CLIENT],
               tab1.[QUANTITY],
               tab1.[CONTRACT_NO],
               tab1.[FC_COUNT],
               CASE WHEN tab2.[KEY] IS NOT NULL
                    THEN 'N'
                    ELSE 'Y'
               END       AS [DELETE_FLAG],
               CAST(GETDATE() AS DATE) AS [RECORD_UPDATED_DATE]
        FROM      tab1
        LEFT JOIN tab2
               ON tab1.[KEY] = tab2.[KEY] 
    
        UNION ALL 
    
        SELECT *, 
               'N'          AS [DELETE_FLAG],
               '9999-12-31' AS [RECORD_UPDATED_DATE]
        FROM tab2
    )
    SELECT *
    INTO history
    FROM cte
    ORDER BY [KEY];
    

    Check the demo here.