Search code examples
sqlsql-serverdata-partitioningstdevp

STDEVP for calculated fields


I have a table that looks like this:

ID       CHANNEL    VENDOR  num_PERIOD  SALES.A SALES.B
000001  Business    Shop    1            40       30
000001  Business    Shop    2            60       20
000001  Business    Shop    3            NULL     30

With many combinations of ID, CHANNEL and VENDOR, and sales records for each of them over time (num_PERIOD).

I want to get the average Standard Deviation of a new field, which returns the sum of SALES.A + SALES.B sum(IS.NULL(SALES.A,0) + ISNULL(SALES.B,0)).

The problem I have is that STDEVP seem to fail with calculated fields, and the result that returns is invalid.

I have been trying with:

select ID, CHANNEL, VENDOR, stdevp(sum(isnull(SALES.A,0) + ISNULL(QSALES.B,0))) OVER (PARTITION BY  ID, CHANNEL, VENDOR) as STDEV_SALES
      FROM TABLE
GROUP BY ID, CHANNEL, VENDOR

However, the results I'm obtaning are always 0 or NULL.

What I want to obtain is the Average Standard Deviation of each ID, CHANNEL and VENDOR over time (num_PERIOD).

Can someone find an approximation for this please?


Solution

  • Your query doesn't match the sample data.

    I can see the problem, though. The SUM() are calculating a single value for each group, and then you are taking the standard deviation of that value. Because you cannot nest aggregation functions, you have turned it into a window function.

    Get rid of the sum(). The following should work in SQL Server:

    SELECT ID, CHANNEL, VENDOR, 
           STDEVP(COALESCE(SALES.A, 0) + COALESCE(QSALES.B, 0))  as STDEV_SALES
    FROM SALES . . .
         QSALES
    GROUP BY ID, CHANNEL, VENDOR;
    

    I would also return the COUNT(*) . . . the standard deviation doesn't make sense if you have fewer than 3 rows. (Okay, it is defined for two values, but not very useful.)