Search code examples
sql-serverdatabase-designdatamart

Best way to build a DataMart from multiple external systems?


I'm in the planning stages of building a SQL Server DataMart for mail/email/SMS contact info and history. Each piece of data is located in a different external system. Because of this, email addresses do not have account numbers and SMS phone numbers do not have email addresses, etc. In other words, there isn't a shared primary key. Some data overlaps, but there isn't much I can do except keep the most complete version when duplicates arise.

Is there a best practice for building a DataMart with this data? Would it be an acceptable practice to create a key table with a column for each external key? Then, a unique primary ID can be assigned to tie this to other DataMart tables.

Looking for ideas/suggestions on approaches I may not have yet thought of.

Thanks.


Solution

  • The email address or phone number itself sounds like a suitable business key. Typically a "staging" database is used to load the data from multiple sources and then assign surrogate keys and do other transformations.

    Are you familiar with data warehouse methods and design patterns? If you don't have previous knowledge or experience then consider hiring some help. BI / data warehouse projects have a very high failure rate and mistakes can be expensive.