I'm trying to refactor some SQL code to make it more readable and maintainable without destroying performance. I moved some column logic from a SELECT statement into several functions, but I'm seeing a huge drop in performance. I'm hoping you all can help me understand why, and even better, how to fix it!
After refactoring, my code looks roughly like the example below. Before refactoring, the CASE expressions sat directly inside the SUM calls in the SELECT clause instead of in function calls.
CREATE FUNCTION funcOne(@colA INT, @colB INT, @colC INT, @valX INT, @valY INT)
RETURNS INT AS
BEGIN
    RETURN CASE
               WHEN @colA = @colB
                   THEN @valX + @valY
               WHEN @colC BETWEEN 1 AND 10
                   THEN @valX
               ELSE 0
           END
END
CREATE FUNCTION funcTwo(@colA INT, @colB INT, @colC INT, @valX INT, @valY INT)
RETURNS INT AS
BEGIN
    RETURN CASE
               WHEN @colA <> @colB
                   THEN @valX + @valY
               WHEN @colC BETWEEN 1 AND 10
                   THEN @valY
               ELSE 0
           END
END
SELECT mt.[Ident]
    ,SUM(dbo.funcOne(mt.[colA], ot.[colB], ot.[colC], mt.[valX], ot.[valY])) AS funcOne
    ,SUM(dbo.funcTwo(mt.[colA], ot.[colB], ot.[colC], mt.[valX], ot.[valY])) AS funcTwo
FROM MyTable AS mt
INNER JOIN SomeOtherTable AS ot
ON mt.[Ident] = ot.[Ident]
WHERE mt.[colA] BETWEEN 1 AND 100
GROUP BY mt.[Ident]
Before refactoring, the query took about 60 seconds to run. After refactoring, it takes nearly 7 minutes! The scan and read counts are identical, so it's strange to me that it takes so much longer.
What is SQL Server doing that makes the refactored version so inefficient? Is there any way to resolve this AND keep my nice, readable code?
Thanks for all the "why?" information, @conor-cunningham-msft.
In terms of resolving the performance issue, I ended up using the suggestion from @Simonare and others.
Here's what my code looks like:
CREATE FUNCTION funcOne(@colA INT, @colB INT, @colC INT, @valX INT, @valY INT)
RETURNS TABLE AS
RETURN (
    SELECT CASE
               WHEN @colA = @colB
                   THEN @valX + @valY
               WHEN @colC BETWEEN 1 AND 10
                   THEN @valX
               ELSE 0
           END AS [MyValue]
)
CREATE FUNCTION funcTwo(@colA INT, @colB INT, @colC INT, @valX INT, @valY INT)
RETURNS TABLE AS
RETURN (
    SELECT CASE
               WHEN @colA <> @colB
                   THEN @valX + @valY
               WHEN @colC BETWEEN 1 AND 10
                   THEN @valY
               ELSE 0
           END AS [MyValue]
)
SELECT mt.[Ident]
,SUM(funcOne.[MyValue]) AS funcOneValue
,SUM(funcTwo.[MyValue]) AS funcTwoValue
FROM MyTable AS mt
INNER JOIN SomeOtherTable AS ot
ON mt.[Ident] = ot.[Ident]
CROSS APPLY funcOne(mt.[colA], ot.[colB], ot.[colC], mt.[valX], ot.[valY]) AS funcOne
CROSS APPLY funcTwo(mt.[colA], ot.[colB], ot.[colC], mt.[valX], ot.[valY]) AS funcTwo
WHERE mt.[colA] BETWEEN 1 AND 100
GROUP BY mt.[Ident]
This does perform a little slower than the original, but the degradation is minimal and, I think, worth it for the sake of maintainability and readability.
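For anyone who wants to compare the two versions themselves, SQL Server's standard session statistics report elapsed time, CPU time, and I/O counts per statement (output appears in the Messages tab):

```sql
-- Turn on per-statement timing and I/O statistics for this session.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- ... run the original query, then the CROSS APPLY version ...

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

This is how I saw that the scan and read counts were identical while the elapsed times were not.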
Scalar-valued functions are generally bad practice in terms of performance. Let's say that you have this function:
CREATE FUNCTION fn_GetName(@CustomerID int)
RETURNS varchar(100)
AS
BEGIN
    DECLARE @myResult VARCHAR(100);

    SELECT @myResult = Fullname
    FROM Sales s
    WHERE s.CustomerID = @CustomerID;

    RETURN @myResult;
END
and let's say that we call this function like so:
SELECT dbo.fn_GetName(id)
FROM Student;
SQL Server executes this function row by row, which creates a performance bottleneck. Inline table-valued functions, however, are not executed row by row:
CREATE FUNCTION fn_GetNameTable(@CustomerID int)
RETURNS TABLE
AS
RETURN (
SELECT Fullname
FROM Sales s
WHERE s.CustomerID = @CustomerID
)
Then,
SELECT I.Fullname
    ,s.CustomerType
FROM Sales AS s
CROSS APPLY fn_GetNameTable(s.CustomerID) AS I
runs as a single set-based query: the optimizer inlines the function body into the calling query's plan instead of invoking the function once per row.
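To see why, note that after inlining, the CROSS APPLY version is effectively the same as writing the subquery directly into the outer query. Roughly (a sketch, not the literal plan the optimizer produces):

```sql
-- What the inlined TVF amounts to: the function body becomes
-- a correlated subquery inside the outer query, so the whole
-- thing is optimized and executed as one set-based operation.
SELECT I.Fullname
FROM Sales AS s
CROSS APPLY (
    SELECT s2.Fullname
    FROM Sales AS s2
    WHERE s2.CustomerID = s.CustomerID
) AS I;
```

The scalar version offers the optimizer no such option; the function body stays an opaque call that must be evaluated for every row.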
You can read more at this address.