Search code examples
sql-serverdata-warehouse

Multiple customer dimension


I am new to dwh and creating data warehouse for my company. We have a customer dimensions which we want to create.However finance team customer list Id and name are different from the CRM list ID and name. We created a mapping table but sometimes for same finance ID we have multiple crmid How should I handle this...do I have to create two customer dimension

   CRM        Name
  CRMID1     XXXX
  CRMID2     YYYYY
  CRMID3     ZZZ

Mapping table

   CRMID    FINID
   CRMID1   FINID1
   CRMID2   FINID1
   CRMID3   FINID2
   CRMID4   FINID3

FinTable

   FINID1   XXXX
   FINID2   YYYY
   FINID3   ZZZ

When I create the Dimension and join these three tables

Final Dimension

  ClientKey CRMID   FINID
    1      CRMID1   FINID1
    2      CRMID2   FINID1
    3      CRMID3   FINID2
    4      CRMID4   FINID3

This causes duplication when I join the dimension table with Fact table and add the ClientKey in the Fact table.

Should I create Two Dimension for Client or Is their any other way I can handle it in one dimension.


Solution

  • As a Data Warehouse you must have a unique customer ID.

    A solution involving your operational systems is best (i.e - Master Data Management) but if this is too hard to drive (and it usually is) at least create a unique customer ID in the DWH Customer dimension.

    Both the CRMID and the FINID are attributes of a customer in the Customer dimension regardless of the current relation between them (1:1 or 1:M or even M:M).

    As mmarie mentioned your unique customer ID can be a combination of business keys but I strongly advise creating a separate surrogate key since as Ralph Kimball states production keys are unstable\volatile.

    Another reason is that your source systems change over time (like you mentioned) even when your customers don’t so you need the ability to track and report on a customer over multiple systems and periods.

    Assuming you will go for a SCD type 2 – you will want something like this:

    • PK (Surrogate, time variant)
    • CustomerID (Surrogate, permanent, DWH generated, unique per customer)
    • CRMID (CRM business key)
    • FINID (business key)
    • ATTRIBUTE_1
    • ATTRIBUTE_2
    • START_DATE
    • END_DATE