Search code examples
sqlsql-servercountsummultiple-conditions

Several COUNTS on single SQL Query: Looking for efficiency


I would just like your oppinion on the efficiency of a query. I am counting values through a very large table (several million records). This is MSSQL but I think it should apply to any database engine. What I am doing now is the following:

SELECT Task,
    COUNT(*) as Total,
    SUM(CASE WHEN Status = 'Active' THEN 1 ELSE 0 END) AS Active,
    SUM(CASE WHEN Status = 'Active' AND AppType = 'MOBILE' THEN 1 ELSE 0 END) AS ActiveMobile,
    SUM(CASE WHEN Status = 'Active' AND AppType = 'WEB' THEN 1 ELSE 0 END) AS ActiveWeb,
    SUM(CASE WHEN Status = 'OnHold' THEN 1 ELSE 0 END) AS onHold,
    SUM(CASE WHEN Status = 'onHold' AND AppType = 'MOBILE' THEN 1 ELSE 0 END) AS onHoldMobile,
    SUM(CASE WHEN Status = 'onHold' AND AppType = 'WEB' THEN 1 ELSE 0 END) AS onHoldWeb,
    SUM(CASE WHEN Status = 'Active' OR Status = 'onHold' THEN 1 ELSE 0 END) AS ActiveAndOnHold,
    SUM(CASE WHEN (Status = 'Active' OR Status = 'onHold') AND AppType = 'MOBILE' THEN 1 ELSE 0 END) AS ActiveAndOnHoldMobile,
    SUM(CASE WHEN (Status = 'Active' OR Status = 'onHold') AND AppType = 'WEB' THEN 1 ELSE 0 END) AS ActiveAndOnHoldWeb
FROM events
GROUP BY Task;

I realize I am counting the same thing over again and I should be able to add partial results but, to be honest, I could not figure out how to do it without traversing the table more than once.

The actual query has about 20 more SUMs with combinations of the same data. The query takes a while to run (about two hours). I am wondering if there is a better way to do this.

Any suggestion is very welcome.

Thanks


Solution

  • You can keep intermediate flags. I doubt this will speed your query, but it will make it simpler to maintain:

    SELECT Task, COUNT(*) as Total,
            SUM(is_active) AS Active,
            SUM(is_active * is_mobile) AS ActiveMobile,
            . . .
    FROM events e CROSS APPLY
         (VALUES (CASE WHEN Status = 'Active' THEN 1 ELSE 0 END),
                 (CASE WHEN Status = 'OnHold' THEN 1 ELSE 0 END),
                 (CASE WHEN AppType = 'WEB' THEN 1 ELSE 0 END),
                 (CASE WHEN AppType = 'MOBILE' THEN 1 ELSE 0 END),
                 . . .
         ) v(is_active, is_onhold, is_web, is_mobile)
    GROUP BY Task;
    

    This might affect performance if your comparisons are actually more cumbersome than simple string equality.