I have a general question about star model in business intelligence project. For example, let's suppose that the project is comprised of one fact table (F) and 3 dimensions (D1, D2, D3). Furthermore, let's assume that fact table looks like this:
d11 d21 d21 m11 m21 m21
d12 d22 d22 m12 m22 m22
d13 d23 d23 m13 m23 m23
d14 d24 d24 m14 m24 m24
d15 d25 d25 m15 m25 m25
d16 d26 d26 m16 m26 m26
d17 d27 d27 m17 m27 m27
For example - d23 = dimension nr 2, value nr 3 in the dimension (same is for measures)
Now, let's assume that a selection is made on every of 3 dimensions and that following parts of the fact table are selected:
d11 d21 d21 m11 m21
D12 d22 D22 m12 m22
D13 D23 D23 m13 m23
D14 D24 D24 m14 m24
d15 D25 D25 m15 m25
d16 d26 D26 m16 m26
d17 d27 d27 m17 m27
Now I would like to know which selections (marked with uppercase 'D') will/should/need to be in star model considered? If OUTER JOIN principle is applied, then the following will be selected:
D12 d22 D22 m12 m22
D13 D23 D23 m13 m23
D14 D24 D24 m14 m24
d15 D25 D25 m15 m25
d16 d26 D26 m16 m26
ie for the selection in the first meassure following values will be considered (m12, m13, m14, m15, m16) and for second measure (m22, m23, m24, m25, m26).
On the other hand, if INNER JOIN is between fact table and dimension tables, result will be following selection:
D13 D23 D23 m13 m23
D14 D24 D24 m14 m24
ie for first measure following values would be considered (m13, m14) and for the second measure (m23, m24) in corresponding aggregate functions.
Which of the following approaches is taking place?
I cannot say that for all systems, but the standard way of selection is that you do selections on the dimension tables, which are then INNER JOIN
ed to the fact table to filter the data, and INNER JOIN
ed back to the dimensions where you have no selections but where columns are shown in the result.