Tags: sql-server, optimization, query-optimization

Why would using a temp table be faster than a nested query?


We are trying to optimize some of our queries.

One query is doing the following:

SELECT t.TaskID, t.Name as Task, '' as Tracker, t.ClientID, (<complex subquery>) Date
INTO [#Gadget]
FROM task t

SELECT TOP 500 TaskID, Task, Tracker, ClientID, dbo.GetClientDisplayName(ClientID) as Client 
FROM [#Gadget]
order by CASE WHEN Date IS NULL THEN 1 ELSE 0 END , Date ASC

DROP TABLE [#Gadget]

(I have removed the complex subquery. I don't think it's relevant other than to explain why this query has been done as a two-stage process.)

I thought it would be far more efficient to merge this down into a single query using subqueries as:

SELECT TOP 500 TaskID, Task, Tracker, ClientID, dbo.GetClientDisplayName(ClientID) as Client
FROM
(
    SELECT t.TaskID, t.Name as Task, '' as Tracker, t.ClientID, (<complex subquery>) Date
    FROM task t
) as sub    
order by CASE WHEN Date IS NULL THEN 1 ELSE 0 END , Date ASC

This should give the optimizer better information to work out what is going on and avoid any temporary tables, so I assumed it would be faster.

But it turns out to be a lot slower: 8 seconds versus under 5 seconds.

I can't work out why this would be the case, as everything I know about databases implies that a subquery should be faster than a temporary table.

What am I missing?

Edit:

From what I can see in the query plans, both are largely identical, except that the temp-table version has an extra "Table Insert" operator with a cost of 18%.

Obviously, because the temp-table version is split into two queries, the Sort Top N accounts for a much larger share of the cost in its second query than the Sort does in the subquery version, so it is difficult to compare the costs directly.

Everything I can see in the plans indicates that the subquery method should be faster.
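
For reference, one way to compare the two versions on actual work rather than on estimated cost percentages is to wrap each run in SET STATISTICS TIME / SET STATISTICS IO (a minimal sketch; the two versions themselves are the queries shown above):

SET STATISTICS TIME ON;  -- reports CPU and elapsed time per statement
SET STATISTICS IO ON;    -- reports logical/physical reads per table

-- run the temp-table version (both statements plus the DROP) here ...

-- ... then run the single-query version here

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;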


Solution

  • Obviously, SQL Server is choosing the wrong query plan. Yes, that can happen; I've run into exactly the same scenario a few times.

    The problem is that optimizing a query (you mention a "complex subquery") is a non-trivial task: if you have n tables, there are roughly n! possible join orders, and that's just the beginning. So it is quite plausible that running (a) your inner query first and (b) your outer query afterwards is a good strategy, but SQL Server cannot deduce this in a reasonable amount of time.

    What you can do is help SQL Server. As Dan Tow writes in his great book "SQL Tuning", the key is usually the join order, going from the most selective to the least selective table. Using common sense (or the method described in his book, which is a lot better), you can determine which join order is most appropriate and then use the FORCE ORDER query hint (a rough sketch follows below).

    Anyway, every query is unique; there is no "magic button" to make SQL Server faster. If you really want to find out what is going on, you need to look at (or show us) the query plans of your queries. Another useful source of data is SET STATISTICS IO, which tells you how much (costly) disk access your query produces.
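
    As a rough sketch of that hint (the table and column names are taken from the question, the omitted <complex subquery> still has to be filled in, and whether FORCE ORDER actually helps depends on the join order written inside it):

    -- FORCE ORDER makes SQL Server join tables in the order they are written,
    -- so this is only useful once the joins inside <complex subquery> are laid
    -- out from the most selective to the least selective table.
    SELECT TOP 500 TaskID, Task, Tracker, ClientID, dbo.GetClientDisplayName(ClientID) as Client
    FROM
    (
        SELECT t.TaskID, t.Name as Task, '' as Tracker, t.ClientID, (<complex subquery>) Date
        FROM task t
    ) as sub
    ORDER BY CASE WHEN Date IS NULL THEN 1 ELSE 0 END, Date ASC
    OPTION (FORCE ORDER);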