Search code examples
ssasbusiness-intelligencecube

What is the maximum SSAS Cube Capacity


Apologies if I've posted this in the wrong place. I appreciate this is a very hard question to answer as there are too many vairables but any advice or pointers would be very much appreciated.

We have an incredibly large, unwieldy, badly designed cube. It's that horrible 'one cube to rule them all' type as demonstrated below. Please note that Dimensions with names which may betray the place I am working for etc have been obfuscated.

What I am trying to get a feel for is how much data a cube can hold, as a generic rule of thumb. I (and several experts, of which I do not claim to be one!) have stated to management that if they continue to add data (and attirbutes) to the cube at the current level, it will fail. What we'd like is a way to work out whether this will be this year, next year, this month etc...and yes, I know this isn't going to have an exact formula answer. Any guidelines would be very helpful as I can't find any online; only best practise for build and I already know that this doesn't conform to that! I'm trying to get a budget approval to redesign it, hence the question...

23 dimensions, No KPIs, 4 calculated measures and 46 other measures

[Dim 1] - 11 attributes
    no hierarchies
    4 address lines, email address, full name, postcode, text provider type
[Area Detail] - 21 attributes
    no hierarchies
    2 address lines, postcode, various name and code fields (string)
[Area Main 1 Month Prior] - 5 attributes
    2 hierarchies
[Area Main 4 Months Prior] - 5 attributes
    2 hierarchies
[Area Main Dimension] - 5 attributes
    2 hierarchies
[Type Dim 1] - 1 attributes
    no hierarchies
[Date Dimension] - 36 attributes
    4 hierarchies
[Event Dimension] - 29 attributes
    no hierarchies
    includes 5 dates which are not linked to date dimension but actually entered
[Event Rank Dimension] - 18 attributes
    no hierarchies
[Event Track Dimension] - 21 attributes
    no hierarchies
    14 date fields
    7 comment fields (freetext)
[History Date Dimension] - 4 attributes
    no hierarchies
    all date data
[Dim 2] - 5 attributes
    no hierarchies
    all freetext fields apart from code
[Official Date Dimension] - 9 attributes
    no hierarchies
    Date field and data about the date
[Previous Dim 2 Dimension] - 4 attributes
    no hierarchies
[xxx Current Record Dimension] - 1 attribute
    no hierarchies
[xxx Dimension] - 102 attributes
    no hierarchies
    4 address fields, postcode, 2 email fields, website
[xxx Dimension 1 Month Prior] - as above
[xxx Dimension 4 Months Prior] - as above
[Dim 3] - 12 attributes
    no hierarchies
[Question Dimension] - 11 attributes
    1 hierarchy
    4 large text fields
[yyy Combination Dimension] - 1 attribute
    no hierarchies
[yyy Current Record Dimension] - 1 attribute
    no hierarchies
[yyy Status Dimension] - 3 attributes
    no hierarchies
[Response Dimension] - 4 attributes
    no hierarchies
    2 large text fields
[zzz Area Dimension] - 4 attributes
    no hierarchies
    2 text fields
[zzz Type Dimension] - 1 attribute
    no hierarchies

I hope this makes sense but happy to provide/clarify detail.


Solution

  • From my experience, the metrics you posted are mainly relevant to usability - adding more dimensions and measures will not cause your cube to "fail". I have successful stable cubes with many more dimensions and measures e.g. double or more.

    The "one cube to rule them all" is an architectural advance introduced in SQL 2005. It optimises the build time, storage and query performance. With SQL Enterprise Edition you can present slices of it as "Perspectives", but I'm not a fan. I prefer to follow carefully planned Dimension and Measure naming as most client tools sort those objects alphabetically.

    What can cause your cube to struggle and perhaps eventually "fail" is the volume of data in your larger dimensions and measure groups. Dimensions under 1m rows are normally no drama. Measure Groups under 100m rows are also usually fine with some basic Aggregations. Bigger than that and you may need to put more work into the design. I aim for sub 5 second response times for 95% of queries with simple end-user tools e.g. Excel 2010+.