Search code examples
pentahodata-warehousedimensional-modelingpentaho-cde

Implementing SCD Type 2 using Pentaho Kettle (Pentaho Data Integeration 5.2)


I am having a table, plan, with columns p_id,p_name,start_date,end_date,last_updated

Problem Statement: when a customer changes from plan A to plan B, its end_date corresponding to plan A gets updated in the table and at the same time a new record for plan B inserted into the table.

I am creating a data warehouse in which "plan" is one of my dimension. Now, for customer A, i want to maintain his plan history in the dimension table. I know, we can solve this problem using SCD Type 2 dimension table. I am using Pentaho Data Integeration tool 5.2. I am trying to solve this problem using "dimension lookup/update" feature available in the tool. But the problem is that I want to insert values for "start_date" and "end_date" from my plan table, not what is set in the "dimension lookup/update". So whenever, a customer changes its plan, its end_date should be updated in the dimension table and at the same time a new record for plan B should be inserted in the table for the same customer.

Can anyone suggest the best approach to solve this problem i.e. values for start_date and end_date should get updated in the dimension table which are coming from my plan table not what pentaho kettle degenerates?


Solution

  • You could use the UPDATE step. This will give you the flexibility of updating any columns with the values/fields you like.