Search code examples
data-warehousedimensionsstar-schemasurrogate-key

How to populate fact table with Surrogate keys from dimensions


Could you please help understand how to populate fact table with Surrogate keys from dimensions.

I have the following fact table and dimensions:

ClaimFacts

ContractDim_SK ClaimDim_SK AccountingDim_SK ClaimNbr ClaimAmount

ContractDim

ContractDim_SK (PK) ContractNbr(BK) ReportingPeriod(BK) Code Name

AccountingDim

TransactionNbr(BK) ReportingPeriod(PK) TransactionCode CurrencyCode (Should I add ContractNbr here ?? original table in OLTP has it)

ClaimDim

CalimsDim_Sk(PK) CalimNbr (BK) ReportingPeriod(BK) ClaimDesc ClaimName (Should I add ContractNbr here ?? original table in OLTP has it)

My logic to load data into fact table is the following :

  1. First I load data into dimensions (with Surrogate keys are created as identity columns)
  2. From transactional model (OLTP) the fact table will be filled with the measures (ClaimNbr And ClaimAmount)

  3. I don’t know how to populate fact table with SKs of Dimensions, how to know where to put the key I am pulling from dimensions to which row in fact table (which key belongs to this claimNBR ?) Should I add contract Nbr in all dimensions and join them together when loading keys to fact?

What’s the right approach to do this? Please help, Thank you


Solution

  • The way it usually works:

    1. In your dimensions, you will have "Natural Keys" (aka "Business Keys") - keys that come from external systems. For example, Contract Number. Then you create synthetic (surrogat) keys for the table.
    2. In your fact table, all keys initially must also be "Natural Keys". For example, Contract Number. Such keys must exist for each dimension that you want to connect to the fact table. Sometimes, a dimension might need several natural keys (collectively, they represent dimension table "Granularity" level). For example, Location might need State and City keys if modeled on State-City level.
    3. Join your dim table to the fact table on natural keys, and from the result omit natural key from fact and select surrogat key from dim. I usually do a left join (fact left join dim), to control records that don't match. I also join dims one by one (to better control what's happening).

    Basic example (using T-SQL). Let's say you have the following 2 tables:

    Table Source.Sales
    (   Contract_BK, 
        Amount, 
        Quantity)
    
    Table Dim.Contract
    (   Contract_SK,
        Contract_BK,
        Contract Type)
    

    To Swap keys:

    SELECT
         c.Contract_SK
        ,s.Amount
        ,s.Quantity
    INTO
        Fact.Sales
    FROM
        Source.Sales s LEFT JOIN Dim.Contract c ON s.Contract_BK = c.Contract_BK
        
    -- Test for missing keys
    SELECT 
        * 
    FROM 
        Fact.Sale 
    WHERE 
        Contract_SK IS NULL