Search code examples
sqloptimizationindexingsql-execution-plan

Are SQL Execution Plans based on Schema or Data or both?


I hope this question is not too obvious...I have already found lots of good information on interpreting execution plans but there is one question I haven't found the answer to.

Is the plan (and more specifically the relative CPU cost) based on the schema only, or also the actual data currently in the database?

I am try to do some analysis of where indexes are needed in my product's database, but am working with my own test system which does not have close to the amount of data a product in the field would have. I am seeing some odd things like the estimated CPU cost actually going slightly UP after adding an index, and am wondering if this is because my data set is so small.

I am using SQL Server 2005 and Management Studio to do the plans


Solution

  • It will be based on both Schema and Data. The Schema tells it what indexes are available, the Data tells it which is better.

    The answer can change in small degrees depending on the DBMS you are using (you have not stated), but they all maintain statistics against indexes to know whether an index will help. If an index breaks 1000 rows into 900 distinct values, it is a good index to use. If an index only results in 3 different values for 1000 rows, it is not really selective so it is not very useful.