Search code examples
databasedata-warehouse

naming dimension tables in data warehouse


I have a question about naming convention for dimension tables in data warehouse. I wonder it is a good idea to keep the type of dimension in the name. for example for dimension type 1 we will have something like this:

  1. dim_scd1_student ( it is of slowly changing dimension type 1)
  2. dim_scd2_teacher ( it is of slowly changing dimension type 2)

Solution

  • No.

    Neither Dr. Ralph Kimball nor Christopher Adamson use this convention in their books. Both authors recommend this detail be documented in and managed from the ETL process.

    End users, such as managers and business analysts, will interact with table and column names. The best of them will know how to construct simple SQL queries to get the business answers they seek. But slowly changing dimensions are implementation details and not relevant to them. A column name "dim_scd2_..." will only create confusion, questions, and noise. Remember, this system is designed for business people, not software engineers.

    Also, I would omit the "dim" prefix. The business measurement or event tables should be suffixed with "fact" or "facts" (Kimball & Adamson use both). All other tables are by default dimensions.

    Aside: doing what you suggest is similar to embedding the storage type into a programming language variable name. For example, in a C program, the "int" type could be used as follows: int_customer_id. But what if in the future "int" isn't big enough and we need to use "long"? Oops.