Search code examples
sqloptimizationindexingsql-server-2005sql-execution-plan

Are SQL execution plans based on schema or data or both?


Is an SQL execution plan and relative CPU cost based on the schema only, or also the data in the database?

I am analyzing where indexes are needed in my product's database, but am working with my own test system which does not have close to the amount of data a product in the field would have. I am seeing some odd things like the estimated CPU cost going slightly UP after adding an index, and am wondering if this is because my data set is so small.

I am using SQL Server 2005 and Management Studio.


Solution

  • It will be based on both Schema and Data. The Schema tells it what indexes are available, the Data tells it which is better.

    The answer can change in small degrees depending on the DBMS you are using (you have not stated), but they all maintain statistics against indexes to know whether an index will help. If an index breaks 1000 rows into 900 distinct values, it is a good index to use. If an index only results in 3 different values for 1000 rows, it is not really selective so it is not very useful.