SQL Server sproc query optimization


I've got an application that generates data for reports that look like:

                   | age < 30 | age >= 30 |   asian   | hispanic
-----------------------------------------------------------------
clients in prog A  |          |           |           |
-----------------------------------------------------------------
clients in prog B  |          |           |           |
-----------------------------------------------------------------
number clients     |          |           |           |
-----------------------------------------------------------------
number children    |          |           |           |

The queries are sometimes very long, and I'd like to optimize them.

I don't have permissions on the server to run the query analyzer (and I read that it's often better not to use its suggestions). The longest sprocs take ~35 seconds to execute.

Reading around, the things to avoid for good query performance are:

  • Select *
  • exists
  • distinct
  • cursors
  • having

I have a few questions about the task at hand:

  • how much of a difference am I looking at by changing Select * into Select colA, colB ... ? Is it really worth the trouble?
  • how can I optimize if exists( ... )? Is if( Select Count(query ) > 0 ) a good optimization?
  • If I am really going to return all of the columns in a table, is it okay to use Select * ?

I don't want to post these queries because they are so long and terrible, but what other suggestions might you be able to offer? I'm trying to use re-usable functions and temporary tables wherever possible to ease the strain both on my brain and on the server.


Solution

    1) how much of a difference am I looking at by changing Select * into Select colA, colB ... ? Is it really worth the trouble?
    It can make quite a big difference, and it is good practice in general to select the fields you want and ONLY those fields. For example, if you do a SELECT * to return 50 fields when you only need 2 of them, every column of each row has to be read and returned. If you select just those 2 fields and both are included in a suitable index, all the data can be served from the index without having to look up the rest of the row from the data pages, which is much cheaper.
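
A minimal T-SQL sketch of the covering-index point above. The table `dbo.Clients`, its columns, and the index name are hypothetical, invented for illustration; they are not taken from the asker's schema.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE dbo.Clients (
    ClientId   INT IDENTITY(1,1) PRIMARY KEY,
    Program    CHAR(1)      NOT NULL,
    Age        INT          NOT NULL,
    Ethnicity  VARCHAR(20)  NOT NULL,
    Notes      VARCHAR(MAX) NULL   -- wide column the report never needs
);

-- Nonclustered index keyed on Program that also stores Age at the leaf level.
CREATE NONCLUSTERED INDEX IX_Clients_Program_Age
    ON dbo.Clients (Program)
    INCLUDE (Age);

-- Every referenced column lives in the index, so this query can be
-- answered from the index alone.
SELECT Program, Age
FROM dbo.Clients
WHERE Program = 'A';

-- SELECT * asks for Ethnicity and Notes too, which the index does not hold,
-- so the rest of each row has to be fetched from the table itself.
SELECT *
FROM dbo.Clients
WHERE Program = 'A';
```

Whether the optimizer actually picks the index depends on the data and statistics, so comparing the two execution plans on the real tables is the only way to see the difference for a given query.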

    2) how can I optimize if exists( ... )? Is if( Select Count(query ) > 0 ) a good optimization?
    No, SELECT COUNT() is worse. EXISTS is the most performant way to do this kind of check, because it can stop as soon as it finds the first matching record, whereas COUNT() asks for every match to be counted, which is unnecessary. I wouldn't class EXISTS in the bad camp with cursors at all, to be honest.
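
To make the comparison concrete, here is a small sketch of the two forms side by side. The temp table `#Clients` and its rows are made up for the example, and the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample data, for illustration only.
CREATE TABLE #Clients (
    ClientId INT     PRIMARY KEY,
    Program  CHAR(1) NOT NULL,
    Age      INT     NOT NULL
);

INSERT INTO #Clients (ClientId, Program, Age)
VALUES (1, 'A', 25), (2, 'A', 41), (3, 'B', 33);

-- Preferred: the check is satisfied by the first matching row.
IF EXISTS (SELECT 1 FROM #Clients WHERE Program = 'A' AND Age < 30)
    PRINT 'Program A has at least one client under 30';

-- Same answer, but phrased as "count every match, then compare with zero".
IF (SELECT COUNT(*) FROM #Clients WHERE Program = 'A' AND Age < 30) > 0
    PRINT 'Program A has at least one client under 30';

DROP TABLE #Clients;
```

Both branches give the same result; the EXISTS form simply states the intent ("is there any row?") directly, which is what lets the engine stop early.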

    3) If I am really going to return all of the columns in a table, is it okay to use Select *?
    Well, if you truly want them all then it doesn't matter as much. But SELECT * also means that any columns added to the table in future will be returned too, and that sudden change in the result set could break existing code that expects a fixed set of columns.
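
One way such a break can show up, sketched with hypothetical temp tables (`#Clients` and `#ClientsArchive` are invented for the example): an INSERT ... SELECT * that works until someone adds a column to the source table.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE #Clients        (ClientId INT PRIMARY KEY, Program CHAR(1) NOT NULL);
CREATE TABLE #ClientsArchive (ClientId INT NOT NULL,    Program CHAR(1) NOT NULL);

INSERT INTO #Clients (ClientId, Program) VALUES (1, 'A');

-- Works today: both tables have the same two columns.
INSERT INTO #ClientsArchive
SELECT * FROM #Clients;

-- Later, a column is added to the source table.
ALTER TABLE #Clients ADD Age INT NULL;

-- The same statement would now fail, because SELECT * returns three
-- columns and the target table only has two:
-- INSERT INTO #ClientsArchive
-- SELECT * FROM #Clients;

-- An explicit column list is unaffected by the new column.
INSERT INTO #ClientsArchive (ClientId, Program)
SELECT ClientId, Program
FROM #Clients;

DROP TABLE #ClientsArchive;
DROP TABLE #Clients;
```

Listing the columns costs a little typing but means a schema change only affects the queries that are deliberately updated to use it.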