I have a Transaction table with about 200 million records, one primary key clustered on Id and 2 indexes:
I run these 2 statements before I proceed with the actual query to update statistics
Update STATISTICS dbo.[Transaction] IX_SiloId_ChangedTime_IncludeTime WITH FULLSCAN
Update STATISTICS dbo.[Transaction] IX_SiloId_Time_IncludeContent WITH FULLSCAN
This is my query:
DECLARE @Query SiloTimeQueryTableType -- (SiloId, Time) with primary key clustered on SiloId
INSERT INTO @Query VALUES
(1, '2020-12-31'), -- 1000 total values, though it's still the same problem with just one
SELECT t.*
FROM [Transaction] t
INNER JOIN @Query q
ON t.SiloId = q.SiloId
WHERE
t.Time >= q.Time
Now what happens is for whatever reason Sql Server choses IX_SiloId_ChangedTime_IncludeTime
. It then takes forever. If I use WITH (INDEX(IX_SiloId_Time_IncludeContent))
I get the result right away.
The correct index is quite obvious here, but yet SQL Server choses the one that is not even indexed on Time.
I cannot understand this behaviour, but from what I read it is best to avoid hints for Indexes, though I made this Index with this query in mind.
So the question is: what can I do to try to figure out why SQL Server prefers the "wrong" index even though a much better one exists and I just run full statistics update?
I have created an temporary table as many suggest that TVP fails, but the result is the same:
CREATE TABLE #Query
(
SiloId bigint NOT NULL PRIMARY KEY CLUSTERED,
Time datetime2(7) NOT NULL
)
Execution plans:
https://www.brentozar.com/pastetheplan/?id=rJOt3G00P
https://www.brentozar.com/pastetheplan/?id=ByFshGAAP (this one is live, as it takes too long)
Indices:
CREATE NONCLUSTERED INDEX [IX_SiloId_Time_IncludeContent] ON [dbo].[Transaction]
(
[SiloId] ASC,
[Time] ASC
)
INCLUDE([SiloContent]) WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_SiloId_ChangedTime_IncludeTime] ON [dbo].[Transaction]
(
[SiloId] ASC,
[ChangedTime] ASC
)
INCLUDE([Time]) WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF, ONLINE = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO
for whatever reason Sql Server choses IX_SiloId_ChangedTime_IncludeTime
That is not what the execution plan says. SQL Server chooses PK_Transaction
clustered index when no index hint is specified.
It is clear to me why SQL Server chooses PK_Transaction
instead of IX_SiloId_Time_IncludeContent
when looking at the execution plans. The reason is poor cardinality estimates. Both execution plans show that SQL Server estimates that the join operation produces 2.5182.000 rows, but it actually produces 4.155 rows. If SQL Server chooses IX_SiloId_Time_IncludeContent
then it estimates that it needs to perform 2.5182.000 key lookups. With 2.5182.000 key lookups using IX_SiloId_Time_IncludeContent
index the plan is more expensive than a plan with hash match and clustered index scan. If SQL Server was able to estimate better, it would choose IX_SiloId_Time_IncludeContent
because with only 4.155 key lookups the plan is much less costly.
So, what can you do?. I think on two options: