I have a project where establishments are inspected anything from once every 6 months to once every 3 years and the results of the inspection scorecard are recorded as a record in a type 2 slowly changing dimension table [tblInspections]
, using StartDate
and EndDate
to cover the period between inspections for which this scorecard is valid. The inspections table is linked to [tblEstablishments] which contains other details about other fixed dimensions such as location and business type.
So currently, we are providing aggregated reports of current situation (where EndDate is null
) and also audit reports for the history of any one establishment (On EstablishmentID
)
My next task is to provide more detailed analysis reports of trends of the scorecard results and I need to provide historical aggregated results of the situation on the last day of each month.
My problem is that despite knowing exactly what I want, I am now unsure how to get there.
1) Do I start by writing ETL process to build a cube based on all the historical results working out what all the aggregates would have been at the end of each month?
2) Am I then able to just process the current records at the end of each month effectively add a new slice onto the end of an existing cube without reprocessing from scratch? (if so how?)
3) Is there another way of doing this? Does Analysis Services have better ways of dealing with SCDs automatically when determining historical status at any point in time by selecting the correct record from multiple records with start and end date?
Any advice and pointers to tutorials related to this would be much appreciated.
First I think you are going to want to build a new periodic (monthly) snapshot fact table if you are trying to analyze the inspection results across establishments (and other dimensions, like time/date). Then you can build the ETL process to populate this new fact table. Finally, you can model the fact table as a new measure group in a new or existing cube...be sure to pay attention to the aggregation property of the measures in this new measure group...typically you don't want to sum periodic snapshot measures (think about what happens if you sum your bank account balance at the end of each month and look at it by year).
Yes, you will run your ETL at the end of each month which will had more rows to your periodic (monthly) snapshot fact table. Then you can just process the cube and you are all set.
Analysis Services handles SCD2 Dimensions quite well (assuming you are using Surrogate Keys...you are aren't you?). I think the business process that you are trying to model (Inspections)...is what is causing some confusion because it's no longer a dimension in this new analysis, it has become a fact (a periodic snapshot fact)