Search code examples
sqlsql-servernested-select

Are scalar subqueries in a group by a bad practice?


I have this query. Should the nested select be avoided ? Is there a better way ?

WITH cte(oi, oIdOf) AS (
    SELECT ROW_NUMBER() OVER (ORDER BY resIdOf), resIdOf
    FROM @res
    WHERE resIdOf<>0
    GROUP BY resIdOf    
)
INSERT INTO @fop
SELECT x.xIdOf
        ,x.xIdBe
        ,x.xLgnBe
        ,(SELECT e.BE_Commercial FROM BE_ENLEVEMENT AS e WHERE e.BE_Numero_BE=x.xIdBe)
        ,SUM(x.xCoeff)
FROM cte AS o
CROSS APPLY dbo.ft_grapheOfOrigine(o.oIdOf) AS x
GROUP BY x.xIdOf,x.xIdBe,x.xLgnBe;

Solution

  • I am not a fan of correlated subqueries with aggregation. The aggregation conditions can be tricky to get right, because the correlation conditions need to refer to the values after the aggregation.

    In fact, correlated subqueries can be quite useful, but the logic is often implemented using LEFT JOIN anyway.

    More importantly, it is really simple to rewrite them. So:

    SELECT x.xIdOf, x.xIdBe, x.xLgnBe,
           e.BE_Commercial,
           SUM(x.xCoeff)
    FROM cte o CROSS APPLY 
         dbo.ft_grapheOfOrigine(o.oIdOf) AS x LEFT JOIN
         BE_ENLEVEMENT e
         ON e.BE_Numero_BE = x.xIdBe
    GROUP BY x.xIdOf, x.xIdBe, x.xLgnBe, e.BE_Commercial;
    

    From a performance perspective, this is not exactly the same as your query, because the JOIN happens before the aggregation and there is an additional aggregation key. However, I think this would be a very minor impact on performance, given that it is already doing an aggregation.

    If this is an issue, you can use a subquery to get essentially the same execution plan.