Search code examples
data-warehousedimensional-modelingstar-schemafactscd

Understanding Slowly Changing Dimension Type 2


I am having difficult time understanding how to use slowly changing dimension type 2, in my scenario. I have gone through different tutorial websites but they don't fit.

I have an employee dimension table containing:

+-----+---------------+------------+------------+
| id  | employee      | designation| Location   |
+-----+---------------+------------+------------+
| 1   | Ola           |   CEO      | Newyork    |
| 2   | Ahmed         | DEVELOPER  | California |
| 3   | Ola           | Manager    | California |
+----------+----------+------------+------------+

I have a Account Fact table

+-------+----------+
|emp_id | Amount   | 
+-------+-----------
| 1     | 2000000  |  
| 2     | 300000   | 
+----------+-------+

Now we see that the dimension has changed, and thus a new ID to same Ola employee has been given. How would we manage in the fact table?

  • The new ID of Ola will not be found in Fact Table.

  • so if we add a new row in fact, with new ID of Ola, how would we link that they are same employee, when they are identified differently, 'primary key'.

  • How would we distinguish this employee is not a new employee and actually location / designation has been changed.


Solution

  • I am sure there are many ways of doing it, here's one way - Have an "employee_Key" in your dimension Table which is unique for an employee. So your dimension table will look like this -

     id  | emp_key | employee   | designation| Location   |Valid From| Valid To |
    -----|---------|------------|------------|------------|----------|----------|
     1   | EMP1    | Ola        |   CEO      | Newyork    |1/1/1900  |1/1/2016  |
     2   | EMP2    | Ahmed      | DEVELOPER  | California |1/1/1900  |NULL      |
     3   | EMP1    | Ola        | Manager    | California |1/2/2016  |NULL      |
    

    You can continue loading your fact table with the "New" ID for the employee. In this case you will have 2 different sets of Keys for that employee.

    +-------+----------+
    |emp_id | Amount   | 
    | 1     | 2000000  |  
    | 2     | 300000   | 
    | 3     | 100000   |
    +----------+-------+
    

    If you want to rollup (say Sum of amounts) for an employee from the beginning, you would join the fact and dimension using the ID key and group by emp_key. So,

    select emp_key, sum(amount) from employee dim, account fact where dim.ID = fact.ID group by emp_key.
    

    If you want to find out the amount since he became a manager, you just have to do rollup on the ID field.

    select dim.ID, sum(amount) from employee dim, account fact where dim.ID = fact.ID group by dim.ID.
    

    or this way -

    select fact.ID, sum(amount) from account fact group by fact.ID.