Search code examples
sql-serverssisetldata-warehousescd

Slowly Changing Dimension - Effective/Expiry Date corruption when running multiple times in one day


I am using Kimball methodology for slowly changing dimensions in our datawarehouse. We have run into problems where a record is being updated multiple times in one day and the ETL runs every 15 minutes. So the Effective and Expiry Dates are getting messed up.

For example:

First      Last    Effective Date    Expiry Date   CurrRowInd
John       Smith   Jan 01, 1900      Aug 5, 2015   N
Jonathon   Smith   Aug 6, 2015       Aug 10, 2015  N
Jonathon   Smithe  Aug 11, 2015      Aug 10, 2015  Y

See how the last row has an EffectiveDate greater than the Expiry Date - this isn't good :(

Has anyone else encountered this type of issue - and how did you handle it?

I am using the Slowly-Changing Dimension Tool in SSIS


Solution

  • Either you have to add Time to the validity dates in the source table (which isn't really that difficult to do), or adjust the code which sets the Effective Date/Expiry Date when someone makes a change (this may be a trigger, for example).

    Your convention about validity dates seems to be that a row is valid from the Effective Date inclusive (in time terms, from Effective Date 00:00:00) to Expiry Date inclusive (in time terms, up to Expiry Date 23:59:59). So the code which updates these dates would have to set an earlier, now completely-invalidated update on the same day to:

    EffectiveDate: today ExpiryDate: day before.

    Obviously the ETL for the dimension would have to be consistent with this, and ignore rows where EffectiveDate>ExpiryDate.