I have been tasked with building a history table in SQL. I have already built the base table which contains multiple left joins amongst other things. The base table will need to be compared to another table and only update specific columns that have changed, insert new rows where the key doesn't match.
Previously I have used other ETL tools which have GUI style built in SCD loaders, but I don't have such luxury in SQL Server. Here the merge statement can handle such operations. I have used the MERGE
statement before, but I become a bit stuck when handling flags and date fields based on the operation performed.
Here is the BASE table
KEY | CLIENT | QUANTITY | CONTRACT_NO | FC_COUNT | DELETE_FLAG | RECORD_UPDATED_DATE |
---|---|---|---|---|---|---|
345 | A | 1000 | 5015 | 1 | N | 31/12/9999 |
346 | B | 2000 | 9352 | 1 | N | 31/12/9999 |
347 | C | 3000 | 6903 | 1 | N | 31/12/9999 |
348 | D | 1000 | 7085 | 1 | N | 31/12/9999 |
349 | E | 1000 | 8488 | 1 | N | 31/12/9999 |
350 | F | 500 | 6254 | 1 | N | 31/12/9999 |
Here is the table I plan to merge with
KEY | CLIENT | QUANTITY | CONTRACT_NO | FC_COUNT |
---|---|---|---|---|
345 | A | 1299 | 5015 | 1 |
346 | B | 2011 | 9352 | 1 |
351 | Z | 5987 | 5541 | 1 |
The results I'm looking for are
KEY | CLIENT | QUANTITY | CONTRACT_NO | FC_COUNT | DELETE_FLAG | RECORD_UPDATED_DATE |
---|---|---|---|---|---|---|
345 | A | 1000 | 5015 | 1 | N | 06/07/2022 |
345 | A | 1299 | 5015 | 1 | N | 31/12/9999 |
346 | B | 2000 | 9352 | 1 | N | 06/07/2022 |
346 | B | 2011 | 9352 | 1 | N | 31/12/9999 |
347 | C | 3000 | 6903 | 1 | Y | 06/07/2022 |
348 | D | 1000 | 7085 | 1 | Y | 06/07/2022 |
349 | E | 1000 | 8488 | 1 | Y | 06/07/2022 |
350 | F | 500 | 6254 | 1 | Y | 06/07/2022 |
351 | Z | 5987 | 5541 | 1 | N | 31/12/9999 |
As we can see I have shown the changes, closed off the old records, marked with a date and a delete flag if they are missing but was there previous, as well as new new row with the new key and data
Would this be a MERGE
? Some direction on how to perform this sort of operation would be a great help. We have a lot of tables where we need to keep change history and this will help a lot going forward.
code shell attempt
SELECT
MAIN_KEY,
CLIENT,
QUANTITY,
CONTRACT_NO,
1 AS FC_COUNT,
NULL as DELETE_FLG_DD,
GETDATE() as RECORD_UPDATED_DATE
INTO #G1_DELTA
FROM
[dwh].STG_DTL
MERGE [dwh].[PRJ1_DELTA] TARGET
USING #G1_DELTA SOURCE
ON TARGET.MAIN_KEY = SOURCE.MAIN_KEY
WHEN MATCHED THEN INSERT
(
MAIN_KEY,
CLIENT,
QUANTITY,
CONTRACT_NO,
FC_COUNT,
DELETE_FLG_DD,
RECORD_UPDATED_DATE
)
VALUES
(
SOURCE.MAIN_KEY,
SOURCE.CLIENT,
SOURCE.QUANTITY,
SOURCE.CONTRACT_NO,
SOURCE.FC_COUNT,
SOURCE.DELETE_FLG_DD,
SOURCE.RECORD_UPDATED_DATE
)
If you need to build a history table containing the updated information from your two tables, you first need to select updated information from your two tables.
The changes that need to be applied to your tables are on:
'Y'
whenever it has a match with tab2N
9999-12-31
.Once these changes are made, you can apply the UNION ALL
to get the rows from your two tables together.
Then, in order to generate a table, you can use a cte to select the output result set and use the INTO <table>
clause after a selection to generate your "history" table.
WITH cte AS (
SELECT tab1.[KEY],
tab1.[CLIENT],
tab1.[QUANTITY],
tab1.[CONTRACT_NO],
tab1.[FC_COUNT],
CASE WHEN tab2.[KEY] IS NOT NULL
THEN 'N'
ELSE 'Y'
END AS [DELETE_FLAG],
CAST(GETDATE() AS DATE) AS [RECORD_UPDATED_DATE]
FROM tab1
LEFT JOIN tab2
ON tab1.[KEY] = tab2.[KEY]
UNION ALL
SELECT *,
'N' AS [DELETE_FLAG],
'9999-12-31' AS [RECORD_UPDATED_DATE]
FROM tab2
)
SELECT *
INTO history
FROM cte
ORDER BY [KEY];
Check the demo here.