I am using Kimball methodology for slowly changing dimensions in our datawarehouse. We have run into problems where a record is being updated multiple times in one day and the ETL runs every 15 minutes. So the Effective and Expiry Dates are getting messed up.
For example:
First Last Effective Date Expiry Date CurrRowInd
John Smith Jan 01, 1900 Aug 5, 2015 N
Jonathon Smith Aug 6, 2015 Aug 10, 2015 N
Jonathon Smithe Aug 11, 2015 Aug 10, 2015 Y
See how the last row has an EffectiveDate greater than the Expiry Date - this isn't good :(
Has anyone else encountered this type of issue - and how did you handle it?
I am using the Slowly-Changing Dimension Tool in SSIS
Either you have to add Time to the validity dates in the source table (which isn't really that difficult to do), or adjust the code which sets the Effective Date/Expiry Date when someone makes a change (this may be a trigger, for example).
Your convention about validity dates seems to be that a row is valid from the Effective Date inclusive (in time terms, from Effective Date 00:00:00) to Expiry Date inclusive (in time terms, up to Expiry Date 23:59:59). So the code which updates these dates would have to set an earlier, now completely-invalidated update on the same day to:
EffectiveDate: today ExpiryDate: day before.
Obviously the ETL for the dimension would have to be consistent with this, and ignore rows where EffectiveDate>ExpiryDate.