Search code examples
sql-servert-sqlquery-optimization

SQL Query Costing, aggregating a view is faster?


I have a table, Sheet1$ that contains 616 records. I have another table, Rates$ that contains 47880 records. Rates contains a response rate for a given record in the sheet for 90 days from a mailing date. Within all 90 days of a records Rates relation the total response is ALWAYS 1 (100%)

Example:

Sheet1$: Record 1, 1000 QTY, 5% Response, Mail 1/1/2009

Rates$: Record 1, Day 1, 2% Response
        Record 1, Day 2, 3% Response
     Record 1, Day 90, 1% Response
     Record N, Day N, N Response

So in that, I've written a view that takes these tables and joins them to the right on the rates to expand the data so I can perform some math to get a return per day for any given record.

SELECT s.[Mail Date] + r.Day as Mail_Date, s.Quantity * s.[Expected Response Rate] * r.Response as Pieces, s.[Bounce Back Card], s.Customer, s.[Point of Entry]
  FROM Sheet1$ as s
 RIGHT OUTER JOIN Rates$ as r
            ON s.[Appeal Code] = r.Appeal
 WHERE s.[Mail Date] IS NOT NULL 
   AND s.Quantity <> 0 
   AND s.[Expected Response Rate] <> 0
   AND s.Quantity IS NOT NULL 
   AND s.[Expected Response Rate] IS NOT NULL);

So I save this as a view called Test_Results. Using SQL Server Management Studio I run this query and get a result of 211,140 records. Elapsed time was 4.121 seconds, Est. Subtree Cost was 0.751.

Now I run a query against this view to aggregate a piece count on each day.

SELECT   Mail_Date, SUM(Pieces) AS Piececount
FROM     Test_Results
GROUP BY Mail_Date

That returns 773 rows and it only took 0.452 seconds to execute! 1.458 Est. Subtree Cost.

My question is, with a higher estimate how did this execute SO much faster than the original view itself?! I would assume a piece might be that its returning rows to management studio. If that is the case, how would I go about viewing the true cost of this query without having to account for the return feedback?


Solution

  • SELECT * FROM view1 will have a plan

    SELECT * FROM view2 (where view2 is based on view1) will have its own complete plan

    The optimizer is smart enough to make the plan for view2 combine/collapse the operations into a most efficient operation. It is only going to observe the semantics of the design of view1, but it is not necessarily required to use the plan for SELECT * FROM view1 and than apply another plan for view2 - this will, in general, be a completely different plan, and it will do whatever it can to get the most efficient results.

    Typically, it's going to push the aggregation down to improve the selectivity, and reduce the data requirements, and that's going to speed up the operation.