Search code examples
powerbidaxpowerpivot

DAX: Calculate only when data is present for all years in dataset


I have a data set that contains sales forecast data by year over 5 years.

Each row has customer, item type, year, qty and sales price.

Not all customers buy all products in all years.

I want to get a list of all products that are purchased in all of the listed years.

An example, cut-down table looks like this:

Customer    Product Year    Qty Price
CustA       ProdA   2020    50  100
CustA       ProdA   2021    50  100
CustA       ProdA   2022    50  100
CustA       ProdB   2020    50  100
CustA       ProdB   2021    50  100
CustA       ProdC   2021    50  100
CustA       ProdC   2022    50  100
CustA       ProdD   2020    50  100
CustA       ProdD   2021    50  100
CustA       ProdD   2022    50  100
CustB       ProdA   2021    50  100
CustB       ProdA   2022    50  100
CustB       ProdC   2020    50  100
CustB       ProdC   2021    50  100
CustB       ProdC   2022    50  100
CustB       ProdD   2020    50  100
CustB       ProdD   2021    0   100
CustB       ProdD   2022    50  100

And transposed, looks like this:

Customer    Product     2020    2021    2022
CustA       ProdA       50      50      50
CustA       ProdB       50      50
CustA       ProdC               50      50
CustA       ProdD       50      50      50
CustB       ProdA               50      50
CustB       ProdC       50      50      50
CustB       ProdD       50      0       50

So, for this example, I'd want to do calculations on, or indicate rows that have a sales qty for all three years. I was trying to use the following formula which I would have compared with the max number of years in the set to mark a row as valid or not, but it's killing Excel. There are only 32,000 rows in the source table.

=CALCULATE(
      DISTINCTCOUNT(DataTable[Year]), 
      filter(DataTable, DataTable[Product] = EARLIER(DataTable[Product])), 
      filter(DataTable, DataTable[Customer] = EARLIER(DataTable[Customer])), 
      filter(DataTable, DataTable[Qty] > 0)
    )

Is there a better approach I could use for this?


Solution

  • How about this?

    ProductList =
    VAR AllYears = DISTINCTCOUNT ( 'DataTable'[Year] )
    VAR Summary =
        SUMMARIZE (
            'DataTable',
            'DataTable'[Product],
            "YearsPurchased", CALCULATE (
                DISTINCTCOUNT ( 'DataTable'[Year] ),
               'DataTable'[Qty] > 0
            )
        )
    RETURN
        SELECTCOLUMNS (
            FILTER ( Summary, [YearsPurchased] = AllYears ),
            "Product", [Product]
        )
    

    The Summary aggregates at the Product level and looks at how many distinct years it had with non-zero quantity. Then you just filter for the ones that match AllYears and take the Product column.


    Note that this returns a single column table and thus doesn't work as a calculated column or measure but a list is what you asked for.

    Edit: To get the YearsPurchased as a calculated column, you just need part of this:

    YearsPurchased =
    CALCULATE (
        DISTINCTCOUNT ( 'DataTable'[Year] ),
        FILTER ( ALLEXCEPT ( 'DataTable', 'DataTable'[Product] ), 'DataTable'[Qty] > 0 )
    )