I'm new to multidimensional data warehousing and have been tasked my workplace in developing a data warehousing solution for reporting purposes, so this might be a stupid question but here it goes ...
Each record in my fact table have FK columns that link out to their respective dimension tables (ex. dimCustomer, dimGeography, dimProduct).
When loading the data warehouse during the ETL process, I first loaded up the dimension tables with the details, then I loaded the fact table and did lookup transformations to find the FK values to put in the fact table. In doing so it seems each row in the fact table has FKs of the same value (ex. row1 has a FK of 1 in each column across the board, row2 has value 2... etc.)
I'm just wondering if this is typical or if I need to rethink the design of the warehouse and ETL process.
Any suggestions would be greatly appreciated.
Thanks
Based on your comments, it sounds like there's a missed step in your ETL process.
For a call center / contact center, I might start out with a fact table like this:
CallFactID - unique key just for ETL purposes only
AssociateID - call center associate who initially took the call
ProductID - product that the user is calling about
CallTypeID - General, Complaint, Misc, etc
ClientID - company / individual that is calling
CallDateID - linked to your Date (by day) Dimension
CallTimeOfDayID - bucketed id for call time based on business rules
CallStartTimestamp - ANSI timestamp of start time
CallEndTimestamp - ANSI timestamp of end time
CallDurationTimestamp - INTERVAL data type, or integer in seconds, call duration
Your dimension tables would then be:
AssociateDim
ProductDim
CallTypeDim
ClientDim
DateDim
TimeOfDayDim
Your ETL will need to build the dimensions first. If you have a relational model in your source system, you would typically just go to the "lookup" tables for various things, such as the "Products" table or "Associates" table, and denormalize any relationships that make sense to be included as attributes. For example, a relational product table might look like:
PRODUCTS: ProductKey,
ProductName,
ProductTypeKey,
ProductManufacturerKey,
SKU,
UPC
You'd denormalize this into a general product dimension by looking up the product types and manufacturer to end up with something like:
PRODUCTDIM: PRODUCTID (DW surrogate key),
ProductKey,
ProductName,
ProductTypeDesc,
ManufacturerDesc,
ManufacturerCountry,
SKU,
UPC
For attributes that are only on your transaction (call record) tables but are low cardinality, you can create dimensions by doing SELECT DISTINCT
on the these tables.
Once you have loaded all the dimensions, you then load the fact by doing a lookup against each of the dimensions based on the natural keys (which you've preserved in the dimension), and then assign that key to the fact row.
For a more detailed guide on ETL with DW Star Schemas, I highly recommend Ralph Kimball's book The Data Warehouse ETL Toolkit.