Search code examples
performancessaspowerbidaxsummarize

How to write an optimized DAX Measure to aggregate a value by two group by attributes


What if we need to aggregate (Sum) of a value group by two attributes in DAX. I wrote the following measure with Summarize function but it is very slow.

Reorder :=
SUMX (
    SUMMARIZE (
        TableA,
        TableA[ProdID],
        TableA[CustID],
        "ReordersCount",
        VAR VarInvoiceCount =
            SUM ( TableA[InvoiceCount] )
        RETURN
            IF ( VarInvoiceCount > 0, VarInvoiceCount - 1, 0 )
    ),
    [ReordersCount]
) 

I also looked for SummarizeColumns but its not working in the report when I am applying other attributes slicers. May be I am missing something? Looking for optimized solution. Many thanks in advance.


Solution

  • Consider the following approach:

    First, create a measure for total number of invoices:

    Total Invoice Count = SUM(TableA[InvoiceCount])
    

    Second, create a measure to count a number of first-time invoices, which is simply a number of unique product-customer combinations in your table:

    First Invoice Count =
    COUNTROWS ( SUMMARIZE ( TableA, TableA[CustID], TableA[ProdID] ) )
    

    Finally, the desired result is simply the difference of these two measures:

    Reorder Count = [Total Invoice Count] - [First Invoice Count]
    

    The formula will respond properly to all slicers and filters, and should be very fast because there are no nested iteration loops such as SUMX(SUMMARIZE()), no context transitions and no call-backs inside the loops caused by using IF statements (that's a bit of an advanced topic).

    Of course, you can put everything in one measure using variables:

    Reorder Count = 
    VAR Total_Invoice_Count = SUM(TableA[InvoiceCount])
    VAR First_Invoice_Count = COUNTROWS ( SUMMARIZE ( TableA, TableA[CustID], TableA[ProdID] ) )
    VAR Reorder_Count = Total_Invoice_Count - First_Invoice_Count
    RETURN Reorder_Count
    

    although personally I prefer to break measures down because individual measures are easier to understand and debug, and they might have their own use.

    The above approach is very efficient, but it assumes that TableA contains only valid orders. If it also has cancellations, returns, etc., that might have zero or negative Invoice counts, then you will have to use a less efficient approach, such as:

    Reorder Count =
    SUMX (
        SUMMARIZE ( TableA, TableA[CustID], TableA[ProdID] ),
        VAR Reorder_Count = CALCULATE ( SUM ( TableA[Invoice] ) ) - 1
        RETURN
            IF ( Reorder_Count > 0, Reorder_Count, 0 )
    )
    

    or:

    Reorder Count =
    SUMX (
        SUMMARIZE ( TableA, TableA[CustID], TableA[ProdID] ),
        MAX(CALCULATE ( SUM ( TableA[Invoice] ) ) - 1, 0) )
    

    Nevertheless, they should be still faster than your original formula.