Search code examples
data-warehousemeasurefact

Data warehouse: difference between semi-additive and non-additive measures


I have some difficulty to understand the difference between semi-additive and non-additive measures in a fact table. I've seen this example:

What's the difference between additive, semi-additive, and non-additive measures

But I don't understand it. I tried to read some of the Kimball's books but I don't understand it. In theory, you cannot sum over some of the dimensions but the examples that I see on the Internet seems to be similar than non-additive measures, what's the difference between them.

I need a good explanation with examples because I have an exam soon and I need to understand this :).

Thank you.


Solution

  • Just to be clear, when describing a measure as being semi- or non-additive we are talking about whether the operation makes logical/business sense. We are not saying that you cannot perform any/certain mathematical operations on these measures, just that if you do the result you would get would have no business meaning.

    Semi-additive Measures

    Say you have a fact table like this, showing monthly bank balances for customers :

    # Customer_id Month Balance
    1 AAA 2022-01 100.00
    2 AAA 2022-02 200.00
    3 AAA 2022-03 90.00
    4 AAA 2022-04 750.00
    5 AAA 2022-05 400.00
    6 BBB 2022-01 250.00
    7 BBB 2022-02 68.00
    8 BBB 2022-03 170.00
    9 BBB 2022-04 98.00
    10 BBB 2022-05 230.00

    The balance is additive across customers e.g. the total customer balance for 2022-01 was 350.

    The balance is not additive across months e.g. saying the balance for customer A between 2022-01 and 2022-02 was 100+200=300 is meaningless. At no point was Customer A's month-end balance 300.

    So because this balance measure is additive for some dimensions (Customer) but not for others (Month) it is described as semi-additive.

    Non-Additive

    Say we have a fact table that shows the ratio of the month-end balance to the balance at the end of 2021 (say for Customer A the 2021 balance was 1000 and for Customer B is was 500)

    # Customer_id Month Balance Ratio
    1 AAA 2022-01 100.00 0.1
    2 AAA 2022-02 200.00 0.2
    3 AAA 2022-03 90.00 0.09
    4 AAA 2022-04 750.00 0.75
    5 AAA 2022-05 400.00 0.4
    6 BBB 2022-01 250.00 0.5
    7 BBB 2022-02 68.00 0.17
    8 BBB 2022-03 170.00 0.34
    9 BBB 2022-04 98.00 0.196
    10 BBB 2022-05 230.00 0.46

    There are no aggregations you could make to the ratio that make any sense e.g. summing or averaging the ratio by customer or month would produce a meaningless figure. Therefore the ratio is a non-additive measure