Search code examples
data-warehouseobieestar-schemasnowflake-schema

Snowflaking Date dimension


In my star schema, I have a project dimension which has columns such as start_date, finish_date, service_date, onhold_date, resume_date etc.

Should I introduce foreign keys for all the dates in the fact table and connect them to a date dimension or should I snowflake the project_dimension with date_dimension? Not all the dates are available for a given project so keeping all these columns in a fact_table may result in having null keys in fact_table.

What is the best way to handle dates in this scenario?


Solution

  • In a data warehouse, I always prefer a general star schema, snowflaked as little as possible, although this is obviously a bit of personal preference, and can depend on what environment you are using. For Oracle (the environment I am most used to) it supports snowflaking physically, but best practice denotes not to snowflake the business model (logical) layer.

    Personally, I would push for putting the FKs on the fact for a few reasons. One, that maintains a star, which generally performs better as snowflakes introduce more joins, and stars handle aggregation quicker. Two, if you have users combining this data with data from other facts, having a conformed date dimension just makes sense, can help query performance, and is more robust. Finally, stars are probably most common, so having others work on this area in the future should be easier/the data may work better with other applications in the future.

    For null FKs, I would default to whatever default date your system has, for us, our unspecified record is 01/01/1901. I would not leave them null, unless it is desired to not see 1901 by business users, and even then, I would probably null them out with a case statement, but still leave the field filled on the table.

    Here is a good article describing the advantages/disadvantages of each type. Like I said, neither is completely right or wrong.

    http://www.dataonfocus.com/star-schema-and-snowflake-schema/