Surrogate and business keys are mentioned in wikipedia article about Data Vault.enter link description here. Retained keys are mentioned here.
According to wikipedia article it is not mandatory to specify the surrogate keys for the hub/satellite or link. However, in the link it is mandatory to specify the surrogate keys for the hub it connects. If it is not mandatory to specify the surrogate keys in hubs, how will I enter them in the link table?
Surrogate keys vs business keys. Surrogate keys are needed to connect the structures. Business keys are used to uniquely identify an item. But we could do it also via surrogate key, couldn't we? What is the main difference between the business key and a surrogate key and why cannot I just use a surrogate key as a businees key or vice versa?
In later article they write:
Hub table – contains business key and retained key. The retained key is a key which is mapped to business key one-to-one. In comparison, the surrogate key includes time and there can be many surrogate keys corresponding to one business key. This explains the name of the key, it is retained with insertion of a new version of a row while surrogate key is increasing. Why we need the retained key? There can be many sources, so we cannot use business key as it may have different type or different number of attributes in different source systems.
Because of many resources, there might be different business keys for the same item. But this specific item will be uniquely identified by a retained key. Did I understand it right? So, that mean that a surrogate key maps to different business keys and not one-to-one.
Example. Lets say we have a hub "Employee" with two different business keys coming from two different departments. There is an employee A. From Accounting department employee A will have a business key AT654 and from Project department he will have a business key PR454. The hub will have them two entries for this employer:
RetainedKey BK TimeStamp Source
AAA111 AT654 24.03.2012 Accounting
AAA111 PR454 25.03.2012 Project
But how can I incorporate a surrogate key? Namely, in which situation it will be increased (as stated in "This explains the name of the key,it is retained with insertion of a new version of a row whilesurrogate key is increasing") ?
Could somebody give me an example (situation) of usage of retained keys, business keys and surrogate keys?
There are three main terms in Data Vault 2.0:
If you need to describe the fact that you might have a business key which is different in several systems but means the same then one would use Same-As-Links tables for storage and lookup.
So to join your two systems you'll want to create a Same-As-Link structure, which would look like this:
Hash Key Accounting_Key Project_Key Timestamp
xyzabc... AT654 PR454 24.05.2012
Then this table can be used as a lookup table.