Search code examples
sqlsql-serverssasmdxcross-join

MDX - Count of Filtered CROSSJOIN - Performance Issues


BACKGROUND: I've been using MDX for a bit but I am by no means an expert at it - looking for some performance help. I'm working on a set of "Number of Stores Authorized / In-Stock / Selling / Etc" calculated measures (MDX) in a SQL Server Analysis Services 2012 Cube. I had these calculations performing well originally, but discovered that they weren't aggregating across my product hierarchy the way I needed them to. The two hierarchies predominantly used in this report are Business -> Item and Division -> Store.

For example, in the original MDX calcs the Stores In-Stock measure would perform correctly at the "Item" level but wouldn't roll up a proper sum to the "Business" level above it. At the business level, we want to see the total number of store/product combinations in-stock, not a distinct or MAX value as it appeared to do originally.

ORIGINAL QUERY RESULTS: Here's an example of it NOT working correctly (imagine this is an Excel Pivot Table):

[FILTER: CURRENT WEEK DAYS]
[BUSINESS]          [AUTH. STORES]  [STORES IN-STOCK]   [% OF STORES IN STOCK]
[+] Business One    2,416           2,392               99.01%
[-] Business Two    2,377           2,108               93.39%
    -Item 1         2,242           2,094               99.43%
    -Item 2         2,234           1,878               84.06%
    -Item 3         2,377           2,108               88.68%
    -Item N         ...             ...                 ...

FIXED QUERY RESULTS: After much trial and error, I switched to using a filtered count of a CROSSJOIN() of the two hierarchies using the DESCENDANTS() function, which yielded the correct numbers (below):

[FILTER: CURRENT WEEK DAYS]
[BUSINESS]          [AUTH. STORES]  [STORES IN-STOCK]   [% OF STORES IN STOCK]
[+] Business One    215,644         149,301             93.90%
[-] Business Two    86,898          55,532              83.02%
    -Item 1         2,242           2,094               99.43%
    -Item 2         2,234           1,878               99.31%
    -Item 3         2,377           2,108               99.11%
    -Item N         ...             ...                 ...

QUERY THAT NEEDS HELP: Here is the "new" query that yields the results above:

CREATE MEMBER CURRENTCUBE.[Measures].[Num Stores In-Stock]
AS COUNT(
    FILTER(
        CROSSJOIN(
            DESCENDANTS(
                [Product].[Item].CURRENTMEMBER,
                [Product].[Item].[UPC]        
            ),
            DESCENDANTS(
                [Division].[Store].CURRENTMEMBER,
                [Division].[Store].[Store ID]       
            )
        ),
        [Measures].[Inventory Qty] > 0
    )
), 
FORMAT_STRING = "#,#", 
NON_EMPTY_BEHAVIOR = { [Inventory Qty] }, 

This query syntax is used in a bunch of other "Number of Stores Selling / Out of Stock / Etc."-type calculated measures in the cube, with only a variation to the [Inventory Qty] condition at the bottom or by chaining additional conditions.

In its current condition, this query can take 2-3 minutes to run which is way too long for the audience of this reporting. Can anyone think of a way to reduce the query load or help me rewrite this to be more efficient?

Thank you!


UPDATE 2/24/2014: We solved this issue by bypassing a lot of the MDX involved and adding flag values to our named query in the DSV.

For example, instead of doing a filter command in the MDX code for "number of stores selling" - we simply added this to the fact table named query...

CASE WHEN [Sales Qty] > 0 
    THEN 1 
    ELSE NULL 
END AS [Flag_Selling]

...then we simply aggregated these measures as LastNonEmpty in the cube. They roll up much faster than the full-on MDX queries.


Solution

  • It should be much faster to model your conditions into the cube, avoiding the slow Filter function:

    If there are just a handful of conditions, add an attribute for each of them with two values, one for condition fulfilled, say "cond: yes", and one for condition not fulfilled, say "cond: no". You can define this in a view on the physical fact table, or in the DSV, or you can model it physically. These attributes can be added to the fact table directly, defining a dimension on the same table, or more cleanly as a separate dimension table referenced from the fact table. Then define your measure as

    CREATE MEMBER CURRENTCUBE.[Measures].[Num Stores In-Stock]
    AS COUNT(
        CROSSJOIN(
            DESCENDANTS(
                [Product].[Item].CURRENTMEMBER,
                [Product].[Item].[UPC]        
            ),
            DESCENDANTS(
                [Division].[Store].CURRENTMEMBER,
                [Division].[Store].[Store ID]       
            ),
            { [Flag dim].[cond].[cond: yes] }
    
        )
    )
    

    Possibly, you even could define the measure as a standard count measure of the fact table.

    In case there are many conditions, it might make sense to add just a single attribute with one value for each condition as a many-to-many relationship. This will be slightly slower, but still faster than the Filter call.