Search code examples
t-sqlssrs-2014

Performing Count() and Avg() within a Case statement


I am using this CASE statement to show a total count of visits and average length of visit grouped by age groups:

    CASE WHEN AgeCalcSort = 0 AND  AgeCalcSort <= 1 then (
        Count(VisitID),
        AVG(DATEDIFF(dd,StartDate,EndDate ))
        )
    WHEN AgeCalcSort >= 2 AND  AgeCalcSort <= 17 then (
                Count(VisitID),
        AVG(DATEDIFF(dd,StartDate,EndDate ))
        )
    WHEN AgeCalcSort >= 18 AND  AgeCalcSort <= 64 then (
                Count(VisitID),
        AVG(DATEDIFF(dd,StartDate,EndDate ))
        )
    WHEN AgeCalcSort >= 65 then (
                Count(VisitID),
        AVG(DATEDIFF(dd,StartDate,EndDate ))
        )
    END,

The result should look like this:

Age 1 Count  Age 1 Avg LOS      Age 2 Count  Age 1 Avg LOS      Age 3 Count  Age 3 Avg LOS      Age 4 Count     Age 4 Avg LOS
5           5.3                 18          9.2                 20              12              0               0   

Can anyone adivse what am I doing wrong, or a better way to achieve the end result? Thanks In advance.


Solution

  • You have mentioned 2 points:

    1) Can anyone adivse what am I doing wrong

    • You are using the CASE expression in your query incorrectly.
    • You cannot have 2 columns returned when some condition is satisfied in CASE statement.
    • As per Microsoft documentation, CASE evaluates a list of conditions and returns one of multiple possible result expressions.
    • Check this Microsoft white paper for further study Ref. URL

    2) a better way to achieve the end result

    • There can be many ways to achieve this.
    • Below is one of the simple way. try the query below:

      SELECT
          SUM(CASE WHEN AgeCalcSort = 0 AND AgeCalcSort <= 1 THEN 1 ELSE 0 END) [Age 1 Count], 
          AVG(CASE WHEN AgeCalcSort = 0 AND AgeCalcSort <= 1 THEN DATEDIFF(DAY,StartDate,EndDate) ELSE NULL END) [Age 1 Avg LOS]
          SUM(CASE WHEN AgeCalcSort = 2 AND AgeCalcSort <= 17 THEN 1 ELSE 0 END) [Age 2 Count], 
          AVG(CASE WHEN AgeCalcSort = 2 AND AgeCalcSort <= 17 THEN DATEDIFF(DAY,StartDate,EndDate) ELSE NULL END) [Age 2 Avg LOS]
          SUM(CASE WHEN AgeCalcSort = 18 AND AgeCalcSort <= 64 THEN 1 ELSE 0 END) [Age 3 Count], 
          AVG(CASE WHEN AgeCalcSort = 18 AND AgeCalcSort <= 64 THEN DATEDIFF(DAY,StartDate,EndDate) ELSE NULL END) [Age 3 Avg LOS]
          SUM(CASE WHEN AgeCalcSort >= 65 THEN 1 ELSE 0 END) [Age 4 Count], 
          AVG(CASE WHEN AgeCalcSort >= 65 THEN DATEDIFF(DAY,StartDate,EndDate) ELSE NULL END) [Age 4 Avg LOS]
      FROM [YourTableName]
      

    Query explanation

    • I have achieved count by using SUM, if you look into any of count logic you will notice it says when it satisfies condition 1 else 0. and that is summed later using SUM. So that all 1's ultimately gives count.
    • for average you will notice that then condition is not satisfied I am using NULL, which is intentional, to avoid wrong math.
    • for in TSQL query AVG of (2,4,NULL) is 3, while AVG of (2,4,0) is 2. so in your case NULL will help to avoid messing average value.