Search code examples
sqlsql-server-2012subquerycasedatabase-performance

WHERE-CASE clause Subquery Performance


The question can be specific to SQL server. When I write a query such as :

SELECT * FROM IndustryData WHERE Date='20131231'
AND ReportTypeID = CASE WHEN (fnQuarterDate('20131231')='20131231') THEN  1 
                        WHEN (fnQuarterDate('20131231')!='20131231') THEN  4
                        END;

Does the Function Call fnQuarterDate (or any Subquery) within Case inside a Where clause is executed for EACH row of the table ?

How would it be better if I get the function's (or any subquery) value beforehand inside a variable like:

DECLARE @X INT
IF fnQuarterDate('20131231')='20131231'
SET @X=1 
ELSE
SET @X=0
SELECT * FROM IndustryData WHERE Date='20131231'
AND ReportTypeID = CASE WHEN (@X = 1) THEN  1 
                        WHEN (@X = 0) THEN  4
                        END;

I know that in MySQL if there is a subquery inside IN(..) within a WHERE clause, it is executed for each row, I just wanted to find out the same for SQL SERVER.

...

Just populated table with about 30K rows and found out the Time Difference:

Query1= 70ms ; Query 2= 6ms. I think that explains it but still don't know the actual facts behind it.

Also would there be any difference if instead of a UDF there was a simple subquery ?


Solution

  • I think the solution may in theory help you increase the performance, but it also depends on what the scalar function actually does. I think that in this case (my guess is formatting the date to last day in the quarter) would really be negligible.

    You may want to read this page with suggested workarounds:

    http://connect.microsoft.com/SQLServer/feedback/details/273443/the-scalar-expression-function-would-speed-performance-while-keeping-the-benefits-of-functions#

    Because SQL Server must execute each function on every row, using any function incurs a cursor like performance penalty.

    And in Workarounds, there is a comment that

    I had the same problem when I used scalar UDF in join column, the performance was horrible. After I replaced the UDF with temp table that contains the results of UDF and used it in join clause, the performance was order of magnitudes better. MS team should fix UDF's to be more reliable.

    So it appears that yes, this may increase the performance.

    Your solution is correct, but I would recommend considering an improvement of the SQL to use ELSE instead, it looks cleaner to me:

    AND ReportTypeID = CASE WHEN (@X = 1) THEN  1 
                        ELSE 4
                        END;