Search code examples
ssisdimensional-modelingsql-data-warehouse

DWH Reload data


In monthly increment loaded DWH I have task to create process to be able reload random month in the DWH.

Lets say reload data for February 2021 in existing DWH.

If I reload data for February 2021 on 2021/08/15, my SCD2 dimension Customer will end up like this:

dim customer

I could have wrong dimension attributes until next load. And Dates in DateFrom/DateTo will be messed.

Questions:

  1. is it a good approach to reload single month?
  2. if yes, any advice how to deal with it?
  3. In this case I would prefer full reload of DWH. Is this good idea?

Working on sql server using SSIS ETL tool.

Thanks


Solution

  • If you are just running your existing process then in order to reload data you would need to rollback your DWH to the point prior to the incorrect data, apply the updated dataset again and then reapply all the subsequent datasets. Obviously, this is a significant piece of work so not a good idea unless you have no other choice and definitely not something you’d want to run regularly.

    If you do want to be able to re-apply a single dataset from the past then you’d need to write a process to do this e.g.

    • Identify the existing records that correspond to your updated dataset and delete them
    • Insert your updated dataset taking into account previous and subsequent records