Search code examples
sqldatessasdimensional-modeling

Should a Date Dimension be snowflaked?


Whenever some data in a star schema is available on day level, and other on month level, should I create a separate Month dimension, or are there situations where it is preferred to attach the month level-data to the day-level date dimension (for example always on the first day of the month)?

Example:

Let's say I've got a fact table storing orders on day level, attached to a date dimension storing dates on date level.

+---------+      +-----------+
|  Fact   |      | Dim       |
| Orders  |      | Date      |
+---------+      +-----------+
| FK Date |------| FullDate  |
+---------+      | MonthNr   |
                 | Year      |
                 +-----------+

Solution 1: Snowflake the Date dimension, creating a separate Month dimension

Now I've got some sales targets per month that I need to include in the model. Intuitively, my reaction is to create a dimension on month-level, as a snowflake. My date dimension would then contain a FK to this month dimension, so the hierarchy is explicit.

+---------+      +-----------+
|  Fact   |      | Dim       |
| Orders  |      | Date      |
+---------+      +-----------+      +-----------+      
| FK Date |------| FullDate  |      | Dim Month |      +----------+
+---------+      | Year      |      +-----------+      |  Target  |
                 | Month     |      | Year      |      +----------+
                 | FK Month  |------| YearMonth |------| FK Month |
                 +-----------+      | Month     |      | Target   |
                                    +-----------+      +----------+           

note that the exact set of keys in the fact table (as well as the remaining columns in the date dimension) can be implemented in several ways depending on the choice for optimal, hybrid or standard snowflake

Solution 2: Attach the month-level data to the first day of the month

More recently I've seen more and more people attaching the targets to the original Date dimension, on the first of the month:

+---------+      +-----------+
|  Fact   |      | Dim       |      +----------+
| Orders  |      | Date      |      |  Target  |
+---------+      +-----------+      +----------+
| FK Date |------| FullDate  |------| FK Date  |
+---------+      | MonthNr   |      | Target   |
                 | Year      |      +----------+
                 +-----------+

This last datamodel is especially used whenever the Data Model is primarily used for direct access by customers (for example ad hoc analysis in cross- or pivot tables). To me, it still feels as going against the grain.

What is the best way to model these targets on month level? What are important considerations for choosing one approach over the other?

Update

As @SebTHU points out in his answer, it is possible to model inside SSAS (and, as @Nick.McDermaid adds, in Cognos too) a single dimension (only Date) where, thanks to knowledge about a hierarchy, the targets relate to the month-level, while orders relate to the day level, like this:

+---------+      +-----------+
|  Fact   |      | Dim       |
| Orders  |      | Date      |      +-------------+
+---------+      +-----------+      | Fact Target |
| FK Date |------| FullDate  |      +-------------+
+---------+      | MonthKey  |------| FK MonthKey |
                 | MonthNr   |      | Target      |
                 | Year      |      +-------------+
                 +-----------+

The question that remains is how to implement this in a relational database (say SQL Server) - a Foreign Key isn't allowed to a non-unique attribute. Is it then okay to take solution 2, and re-model this during cube load? Or is it better to take solution 1 (eventually without the FK relation between Dim Month and Dim Date) and 'merge' the two dimensions into a single one during the cube load?


Solution

  • It's possible (in SSAS2014, anyway) to create a Month->Day hierarchy in your Date dimension, then have the dimension slice two measure groups on different granularity attributes.

    So you'd relate the measuregroup with Orders in it to the Date dimension on the Day attribute, but relate the measuregroup with the Targets to the same dimension on the Month attribute.

    The result would be that Targets would have no value on individual days, but would have a value on Months (and on anything higher up in the hierarchy than Months, e.g. quarters or years, if you want those). I think from reading your question that this is the result you're looking for?

    EDIT: SSAS default behaviour is that measures defined on a higher-level dimension member show the same value for all the dimension member's children. So your monthly Target of 1000 would also show as 1000 for each of the days in the month. It's easy enough to over-ride this using a calculated member