Search code examples
sqlsql-serverindexingsql-execution-plan

Why is SQL Server choosing the "wrong" Index?


I have a Transaction table with about 200 million records, one primary key clustered on Id and 2 indexes:

  • IX_SiloId_ChangedTime_IncludeTime
  • IX_SiloId_Time_IncludeContent

I run these 2 statements before I proceed with the actual query to update statistics

Update STATISTICS dbo.[Transaction] IX_SiloId_ChangedTime_IncludeTime WITH FULLSCAN
Update STATISTICS dbo.[Transaction] IX_SiloId_Time_IncludeContent WITH FULLSCAN

This is my query:

DECLARE @Query SiloTimeQueryTableType -- (SiloId, Time) with primary key clustered on SiloId
INSERT INTO @Query VALUES 
(1, '2020-12-31'), -- 1000 total values, though it's still the same problem with just one

SELECT  t.*
FROM    [Transaction] t
INNER JOIN @Query q
    ON t.SiloId = q.SiloId
WHERE 
    t.Time >= q.Time

Now what happens is for whatever reason Sql Server choses IX_SiloId_ChangedTime_IncludeTime. It then takes forever. If I use WITH (INDEX(IX_SiloId_Time_IncludeContent)) I get the result right away.

The correct index is quite obvious here, but yet SQL Server choses the one that is not even indexed on Time.

I cannot understand this behaviour, but from what I read it is best to avoid hints for Indexes, though I made this Index with this query in mind.

So the question is: what can I do to try to figure out why SQL Server prefers the "wrong" index even though a much better one exists and I just run full statistics update?

I have created an temporary table as many suggest that TVP fails, but the result is the same:

CREATE TABLE #Query
(
    SiloId bigint NOT NULL PRIMARY KEY CLUSTERED,
    Time datetime2(7) NOT NULL
)

Execution plans:

https://www.brentozar.com/pastetheplan/?id=rJOt3G00P

https://www.brentozar.com/pastetheplan/?id=ByFshGAAP (this one is live, as it takes too long)

Indices:

CREATE NONCLUSTERED INDEX [IX_SiloId_Time_IncludeContent] ON [dbo].[Transaction]
(
    [SiloId] ASC,
    [Time] ASC
)
INCLUDE([SiloContent]) WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_SiloId_ChangedTime_IncludeTime] ON [dbo].[Transaction]
(
    [SiloId] ASC,
    [ChangedTime] ASC
)
INCLUDE([Time]) WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO

Solution

  • for whatever reason Sql Server choses IX_SiloId_ChangedTime_IncludeTime

    That is not what the execution plan says. SQL Server chooses PK_Transaction clustered index when no index hint is specified.

    It is clear to me why SQL Server chooses PK_Transaction instead of IX_SiloId_Time_IncludeContent when looking at the execution plans. The reason is poor cardinality estimates. Both execution plans show that SQL Server estimates that the join operation produces 2.5182.000 rows, but it actually produces 4.155 rows. If SQL Server chooses IX_SiloId_Time_IncludeContent then it estimates that it needs to perform 2.5182.000 key lookups. With 2.5182.000 key lookups using IX_SiloId_Time_IncludeContent index the plan is more expensive than a plan with hash match and clustered index scan. If SQL Server was able to estimate better, it would choose IX_SiloId_Time_IncludeContent because with only 4.155 key lookups the plan is much less costly.

    So, what can you do?. I think on two options:

    • Include the index hint. Index hints exist for a reason. Poor cardinality estimates is a good reason to include the hint.
    • Try with the covering index suggested by the first execution plan. With covering indexes, no key lookups are needed. So it is very likely that SQL Server chooses the covering index.