Search code examples
data-warehouseetl

Separating Inserts and Updates while loading fact tables


I am trying to understand fundamentals of data warehousing. While loading fact tables, i found two recommendations.

  1. Separate inserts, updates during load
  2. Drop Indexes and build them after load.

What is the advantage of following them?


Solution

  • Simple answers without going into details:

    1. Typically you want to different things with new data (inserts) or changed data (update or insert depending on how you treat changes)

    2. An index needs to be recreated when you alter a table resulting in a lot of slow index recreations after each row of data is loaded. When you load a lot of data (which is a typical scenario in a data warehouse) you significantly slow down the loading process for no good reason. Therefore it is strongly advised to drop the indexes before you load a lot of data and recreate them only once when done loading