I have 2 tables in mssql server.I can perform scd through custom insert/update/delete and also through Merge statement.
I want to know that is there any generic procedure that could server the purpose. we just pass it 2 tables and it should porform the SCD. any option in SQL server 2008? Thanks
No, there isn't and there can't be a generic one suitable for no matter what tables you pass to it. For several reasons:
Not a reason why it's not possible but also consider: For a proper UPSERT
one usually works with temporary tables, the MERGE
statement sucks for SCDs except in special cases. That is because you can't use a MERGE
statement together with an INSERT/UPDATE
and you would have to disable foreign keys for that, since an UPDATE
is implemented as DELETE THEN INSERT
(or something like that, don't remember clearly, but I had those problems when I tried).
I prefer doing it this way (SCD type 2 and SQL Server that is):
Step 1:
IF EXISTS (
SELECT * FROM sys.objects
WHERE name = 'tmpDimSource')
DROP TABLE tmpDimSource;
SELECT
*
INTO tmpDimSource
FROM
(
SELECT whatever
FROM yourTable
);
Step 2:
IF EXISTS (
SELECT * FROM sys.objects
WHERE name = 'tmpDimYourDimensionName')
DROP TABLE tmpDimYourDimensionName;
SELECT * INTO tmpDimYourDimensionName FROM D_yourDimensionName WHERE 1 = 0;
INSERT INTO tmpDimYourDimensionName
(
sid, /*a surrogate id column*/
theColumnsYouNeedInYourDimension,
validFrom
)
SELECT
ISNULL(d.sid, 0),
ds.theColumnsYouNeedInYourDimension,
DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()), 0) /*the current date*/
FROM
tmpDimSource ds
LEFT JOIN D_yourDimensionName d ON ds.whateverId = c.whateverId
;
The ISNULL(d.sid, 0)
in step 2 is important. It returns the surrogate id of your dimension, if an entry already exists, otherwise 0.
Step 3:
UPDATE D_yourDimensionName SET
validTo = DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()) - 1, 0) /*yesterday*/
FROM
D_yourDimensionName d
INNER JOIN tmpDimYourDimensionName t ON d.sid = t.sid
WHERE t.sid <> 0 AND
(
d.theColumnWhichHasChangedAndIsImportant <> t.theColumnWhichHasChangedAndIsImportant OR
d.anotherColumn <> t.anotherColumn
)
;
In Step 3 you mark the existing entry as not valid anymore and keep a history of it. The valid entry you get with WHERE validTo IS NULL
.
You can also add another UPDATE
to overwrite any other column with the new value if needed.
Step 4:
INSERT INTO D_yourDimensionName
SELECT * FROM tmpDimYourDimensionName
WHERE sid = 0;
And that's it.