Search code examples
sqlsql-server-2008reporting-servicesstatisticsssrs-2008

Standard deviation within group in SQL or SSRS (alternatively - calculating Sigma or Cpk)


I have a serious problem with calculating Standard Deviation within subgroup in SSRS/SQl (whichever is more convenient). The real reason behind this is I need to calculate Sigma value (according to Six Sigma principles) or alternatively Cpk value (process capability), but my efforts stop at the standard deviation.

I'm not strong with statistics but it seems that StDev and StDevP functions in SSRS (and SQL) are not calculated 'within subgroup', and I cannot find a function that could do that. Below is an excel screen shot showing the calculation steps which I would like to achieve in SQL/SSRS Edit: Cannot post images yet, please find below

enter image description here

The tricky part is calculating the sum of absolute differences between each pair of values. The order in which the values are given is significant. I forgot to mention that number 27 in denominator of Rbar (Row 7) is the sample size minus 1 (27). Below is also a table in SQL:

DECLARE @Measurements TABLE(Val FLOAT)
INSERT INTO @Measurements (Val)(
    SELECT 485
    UNION ALL SELECT 490.6
    UNION ALL SELECT 490.6
    UNION ALL SELECT 485
    UNION ALL SELECT 485
    UNION ALL SELECT 489
    UNION ALL SELECT 485
    UNION ALL SELECT 477
    UNION ALL SELECT 477
    UNION ALL SELECT 484.6
    UNION ALL SELECT 484.6
    UNION ALL SELECT 484.6
    UNION ALL SELECT 484.6
    UNION ALL SELECT 485
    UNION ALL SELECT 484.6
    UNION ALL SELECT 484.6
    UNION ALL SELECT 485
    UNION ALL SELECT 485
    UNION ALL SELECT 477
    UNION ALL SELECT 484.6
    UNION ALL SELECT 484.6
    UNION ALL SELECT 484.6
    UNION ALL SELECT 484.6
    UNION ALL SELECT 484.6
    UNION ALL SELECT 484.6
    UNION ALL SELECT 482
    UNION ALL SELECT 482
    UNION ALL SELECT 482
)
SELECT STDEV(Val) FROM @Measurements 

I'm guessing that what I need could be achieved by using either RollingValue or some stored procedure in SSRS. Of course if there is a smarter way to calculate Sigma or Cpk I'll be very happy to hear about it.

I hope the question is well formulated, if not please comment. This is very important for me so I will be grateful for any assistance :) Thanks!


Solution

  • I think this should do the trick? Not sure how the denominator of 27 was calculated or the d2 value, so have hard coded them in for now...

    In terms of using this query it is important to preserve the row order, so I have whacked the values into a temp table using an identity column - if your values are coming from a table and there is a fixed criteria for ordering, use row_number() OVER (order by criteria_column) instead

    IF OBJECT_ID('tempdb..#values') IS NOT NULL DROP TABLE #values
    CREATE TABLE #values (row INT IDENTITY (1,1),val FLOAT)
    INSERT INTO #values
        SELECT 485 as val
        UNION ALL SELECT 490.6 
        UNION ALL SELECT 490.6 
        UNION ALL SELECT 485 
        UNION ALL SELECT 485 
        UNION ALL SELECT 489 
        UNION ALL SELECT 485 
        UNION ALL SELECT 477 
        UNION ALL SELECT 477 
        UNION ALL SELECT 484.6 
        UNION ALL SELECT 484.6 
        UNION ALL SELECT 484.6 
        UNION ALL SELECT 484.6 
        UNION ALL SELECT 485 
        UNION ALL SELECT 484.6 
        UNION ALL SELECT 484.6 
        UNION ALL SELECT 485 
        UNION ALL SELECT 485 
        UNION ALL SELECT 477 
        UNION ALL SELECT 484.6 
        UNION ALL SELECT 484.6 
        UNION ALL SELECT 484.6 
        UNION ALL SELECT 484.6 
        UNION ALL SELECT 484.6 
        UNION ALL SELECT 484.6 
        UNION ALL SELECT 482 
        UNION ALL SELECT 482 
        UNION ALL SELECT 482 
    
    
    
    ;with stdevs AS
    (
    SELECT 
    v1.val
    ,ABS(v1.val - v2.val) as abs_diff
    ,v2.val as value2
    from #values v1
    LEFT OUTER JOIN #values v2
    ON v2.row = v1.row + 1
    )
    SELECT
    avg(val) as average_value
    ,sum(abs_diff) as abs_sum
    ,sum(abs_diff) / 27 as Rbar
    ,1.13 as d2
    ,(sum(abs_diff) / 27) / 1.13 as std_dev
    FROM stdevs