Search code examples
mdxiccube

Is there a healthy way to bridge a link between a fact-dimension, where their relationship is only partial?


I'm trying to get a Dimension that only has an incomplete relationship to the measure in question. That if we had two tables A and B with their #'d columns, the set from column A.1 would have column members not present in B.1's set.

In an example scenario, we have three tables:

Items: Supplied by sql view: T_Items

Item ID Item Normal Price
1 Apple $1.99
2 Banana $2.99
3 Grapes $3.99
4 Ham $4.99
5 Beef $5.99

Items on Sale: Supplied by sql view: T_Items_On_Sale

Item on Sale ID Item ID Sale Price Sale Period
1 1 $1.00 1/1 - 1/10
2 1 $1.00 1/21 - 1/31
3 2 $2.00 1/11 - 1/20
4 3 $3.00 1/21 - 1/30
5 2 $1.00 1/1 - 1/10

Sales: Supplied by sql view: T_Sales

Transaction ID Item ID Date Price Quantity
1 1 1/2 1.00 2
2 4 1/23 4.99 1
3 3 1/22 3.00 2
4 1 1/15 1.99 1
5 2 1/4 1.00 1
6 5 1/17 5.99 1

From that above data, I'm trying to arrange something like the following:

Dimension: Key Column: Supplying Datatable:
Items_on_sale Item on Sale ID T_Items_On_Sale
Measure: Supplying Datatable:
Sales T_Sales

Where I make a bridge definitions like the following:

Fact Join Column: Bridge Table: Join Column: Join Column (Dimension table):
Sales.ItemId T_Items_On_Sale T_Items_On_Sale.ItemId Items_on_Sale.ItemId

Items_on_sale's dimension entries would like to relate to the Sales measure based on the Item ID.

  • But Item ID isn't the key used in the Items_on_Sale dimension
  • Their relationship is also not complete.
    • There are Sales made for items that have never been on discount (Sales entries for items 4 and 5) and so won't fill out the tuple of this relationship

In my real scenario, the measure I'm trying to relate to this partially relevant dimension will end up leaving off every single incomplete record, and drastically change the outcome. Is there a healthier way to link such dimensions to these measures?

In my own testing, I've tried:

  • Messing with each combination of the bridge definition:
    • one-to-one select on both sides

    • Whether to relate the "On sale" dimension side to join to its dimension based on the Item ID or the On Sale entry ID.

  • Increasing the amount of allowed max unresolved rows to allow loads to still legally complete. But again, this just means the final product is missing data.

Solution

  • Thanks for thoroughly explaining the problem. As you point out, bridge tables filter the fact table for only the IDs that are in the bridge table (like an SQL inner join).

    To link the T_Items_On_Sale dimension and keep all information from the fact table, you could either:

    • extend the T_Items_On_Sale with all the item IDs that are not present using an SQL union. For the items without any sale, the price and period columns would be null.
    • create a new fact table that is a left-join of T_Sales and T_Items_On_Sale on Item ID and Date between Sale Period. Now you can individually link the T_Items_On_Sale and T_Items as dimensions to this table.