Search code examples
database-designdata-warehousefact-table

Mini-dimension and DateTime


I am designing my first DWH system and have ran into a use-case which I did not find in any books/articles.

Let's say that I have following dimensions:

  • Student
  • Subject

And I have the a single fact table:

  • Grading

The dimension called Student is the one that I am having issue with. It contains all basic info columns which are considered constant and non-updateable. However, I have 4 columns which describe current residence of a student: Address, Town, Fax number and Country. These are prone to change.

During my analyses on how to resolve this, I relied heavily on Kimball Group's articles and I understood that can be solved by using Slow changing dimensions Type 5 - by adding a mini-dimension to the Student dimension.

This all made lot sense to me, however, here is the part that puzzles me totally: None of the examples that I was able to get hold of have DateTime in their mini-dimension.

If I understood it right, we need to be able to track changes to mini-dimension records over time, so recording a simple change (without DT) would just not cut it, right?

Can someone help me understand this better? Can, by the book, mini-dimension contain a DT?

P.S. I would have expected something like a Type 6, but without historical columns.


Solution

  • If I'm following it right (“type 5” is new to me), they call it type 5 because it’s type 4 + type 1. This gives us:

    Type 4, a star schema that includes the “base” dimension (Student) and a mini dimension (call it Residence). These dimensions are related to the fact table, but not to each other.

    Type 5, you still have the two dimensions, but now you add a key to the (Residence) mini dimension to the “base” dimension (Student)… and you treat it as any other fact in a type 1 slowly changing dimension, in that the data is replaced/overwritten when it changes. Type 1 dimensions do not record that the data has changed, and thus do not track when the data changes, so the old value is lost.

    Where/how to factor in time would seem to depend on what you are modelling. If Time is part of the fact table(s), with type 4 the Student’s current Residence when they took a give Subject can be found through joins. Awkward, but perhaps adequate? If you need to know what Residence a Student was in when they took a Subject, type 2 slowly changing dimensions strongly suggest themselves. If Residence is rarely a factor, snowflaking off of Student with Residence and a StudentResidence lookup table might be acceptable.

    (If we’re talking dorm rooms here, your Residence table might not get too big—but Fax Number really throws me off. Faxes were kind of old back at the turn of the century, what student/educational system these days requires them?)