Search code examples
attributesdimensionsstar-schema

How is data in dimensional tables and fact tables of a data warehouse star schema?


I am studying data warehousing star schema and attribute hierarchies and I am getting confused because the examples of the book do not provide sample data on which to confirm my understanding of things.

In the book there is a sales data warehouse with a dimension for products that has the following attribute hierarchy: PRODUCT(AllProducts, ByProductType, OneProduct)

See the image below: enter image description here

What kind of sample data would go into the product dimension table?

PRODUCTS


AllProducts ByProductType OneProduct


?                 ?                       ?

?                 ?                       ?

?                 ?                       ?

If I'm not mistaken attributes are table columns or fields and I can't understand how would the data look like in this table

If you take the dimension Time from the same image, data can easily be sampled:

TIME


Year Quarter Month Week


2010 1           1        1

2010 1           1        2

2010 1           1        3

2010 1           1        4

One other thing I am confused about is how the data looks like in the fact table. In the book it is stated that the fact table would deliberately contain duplicated data. For example to come up with the sales figures of a whole year there is no need to aggregate weekly sales values on the fly because they would have already been calculated. Now if in the time dimension I have a primary key value 1 that represents a year a quarter a month and a week all at once how can I hold the aggregate value of 1 whole year in the fact table?

TIME


ID Year Quarter Month Week


1  2010 1           1        1



SALES_FACT_TABLE


Sales TimeID ProductID


1000  1          ?


Solution

  • The greyish fields (AllProducts, OneProduct etc) in the figure are examples of queries on the products dimension, not attributes.

    Attributes would be Product.Name, Brand, SKU, InternalName, DevelopmentStartDate, OnMarketDate ... whatever