Search code examples
sql-serverperformancedatabase-designindexingclustered-index

Why isn't index likely to do much if it's DATETIME or DATETIME2 because they include the time portion?


Comments to question "How to decrease response time of a simple select query?" tell:

  • "What is the data type on LaunchDate? An index isn't likely to do much if it's DATETIME or DATETIME2 because they include the time portion – OMG Ponies"

  • "@OMG - Why wouldn't a Clustered Index on a DateTime column improve performance? The query is a range scan which would allow for a fast range index lookup as all data would be in sequential blocks? Semi-related...msdn.microsoft.com/en-us/library/ms177416.aspx – Calgary Coder"

  • "Calgary Coder: DATETIME/2 includes time -- an index, clustered or non-clustered, would be good for dates with duplicate times but not ranges. – OMG Ponies"

I created a test table with clustered index on DATETIME type column LaunchDate and observe index seeks for queries similar to cited in above question:

SELECT COUNT(primaryKeyColumn) 
FROM   MarketPlan 
WHERE  LaunchDate > @date

instead of table or index scans.

Why wouldn't a clustered index on a DateTime column improve performance?
Why isn't index likely to do much if it's DATETIME or DATETIME2 because they include the time portion?

I'd appreciate a script illustrating that indexing of DATETIME column does not improve performance.

Update: Also, Did OMG imply that index on DATE type column would be helpful but not DATETIME and DATETIME2?


Solution

  • I've read the other question, no idea what OMG ponies means

    3 points:

    • It shouldn't matter if an index is clustered or non-clustered:
    • It doesn't matter whether time is included too
    • It just has to be useful

    Seek or scan:

    Based on statistics, if LaunchDate > @date means, say, 90% of the rows, then most likely a scan will happen. If it is quite selective, then a seek is more likely.

    Regardless of clustered or non-clustered!

    What index?

    A query like this would require an index on LaunchDate and primaryKeyColumn

    SELECT COUNT(primaryKeyColumn) 
    FROM   MarketPlan 
    WHERE  LaunchDate > @date
    

    Now, any non-clustered index refers to the clustered index which is assumed to the PK by default. So primaryKeyColumn is implicitly included already.

    Superstition

    However, COUNT(primaryKeyColumn) is a superstition. Because PKs do not allow NULL, it is equivalent to

    SELECT COUNT(*) 
    FROM   MarketPlan 
    WHERE  LaunchDate > @date
    
    SELECT COUNT(1) 
    FROM   MarketPlan 
    WHERE  LaunchDate > @date
    

    So you only need an index on LaunchDate, whether clustered or non-clustered