Search code examples
sqlperformanceforeign-keysdata-warehouseparallel-data-warehouse

Is there any performance impact on having NULLs on Foreign key column in a Data mart


We are currently working on Data mart design. We are having many Foreign keys to dimension tables. We are thinking whether to allow NULL in Foreign key dimension fields or have -1 to represent NULL values.

Kimball suggests to keep default row for NULL values. http://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques/fact-table-null/

My lead suggests to keep NULL as NULL.

Will there be any performance impact for keeping NULL in Foreign key fields ?


Solution

  • Kimball is right (as he usually is). Use a default value where you would use NULL.

    Why? It ensures that joins to the dimensions will not "accidentally" filter rows. Trying to reconcile results from different queries eats up a lot of time. Ensuring that joins succeed is one method of reducing such discrepancies.

    If you are not going to follow his advice, then store using NULL. A value such as -1 is particularly bad -- because it prevents the database from enforcing foreign key constraints.