Tags: sql-server, indexing, query-optimization

Why removal of column from index improved performance?


I'm quite puzzled with the issue I just solved. Yes, I have it solved, but I still don't understand why the fix worked.

So, I have a SQL Server table. A rather big one, on the scale of billions of rows. It has a clustered index on two columns, A (~millions of unique values) and B (~thousands of unique values). The A+B combination is not unique. It also has a regular (nonclustered) index on B.
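For reference, the layout described above might look roughly like this (a sketch only; table and column names are invented, not from my actual schema):

```sql
-- Hypothetical schema matching the description; names are placeholders.
CREATE TABLE dbo.BigTable
(
    A       INT NOT NULL,          -- ~millions of distinct values
    B       INT NOT NULL,          -- ~thousands of distinct values
    Amount  DECIMAL(18, 2) NOT NULL -- example column for the aggregation
);

-- Clustered index on (A, B); the combination is not unique.
CREATE CLUSTERED INDEX CX_BigTable_A_B ON dbo.BigTable (A, B);

-- Secondary nonclustered index on B alone.
CREATE NONCLUSTERED INDEX IX_BigTable_B ON dbo.BigTable (B);
```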

There is basically only one query this table has to serve, which boils down to selecting some millions of rows based on certain given values of A and B:

WHERE 
    A IN (a1, a2, ... aN) 
    AND B IN (b1, b2, ... bM)

It also groups and aggregates results, but that's unrelated to the topic.

Anyway, under certain conditions the performance of this query slows down drastically. E.g., on just one server, a few of the hundreds of tests we have would run for 15 minutes, while similar ones usually finish in under 10 seconds. Analysis of the execution plan shows that it was actually scanning the clustered index on both A and B, and doing so billions of times. "Fast" executions, in contrast, entirely ignored B in both the clustered and the secondary index, scanning only on A.

Based on this information, I removed column B from the clustered index, and the tests were fixed.

All machines in question are running SQL Server 2022 Developer edition (in fact, all of them were installed just a few days ago), and statistics were refreshed right before the tests ran. Moreover, the databases are effectively read-only, restored from the same backup. Available memory and CPUs differ between machines.

Hence the question(s): why did removing the second column from the index actually improve query performance? To the best of my understanding, it should only have made things worse, yet apparently it made things better.

Why was SQL Server:

  1. ignoring that part of the index in the first place, when it is actually present in both the index and the query? This index was designed with this query in mind and seems perfect for it. Or isn't it?

  2. slowing everything down by orders of magnitude when it did use it?

P.S. Here is the plan: https://www.brentozar.com/pastetheplan/?id=HJJXnyFQC

P.P.S. Apparently, there is a detail I considered unimportant, yet it matters: the number of As and Bs queried is quite significant (tens of thousands and hundreds to thousands, respectively), and I pass them as a JSON string which is in turn parsed with OPENJSON (or rather, Entity Framework Core does that for me).
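The shape of the query EF Core generates is roughly like this (a hedged sketch; the real parameter names and projection differ):

```sql
-- Sketch: the id lists arrive as JSON arrays and are expanded with OPENJSON.
DECLARE @aIds NVARCHAR(MAX) = N'[1,2,3]';    -- tens of thousands of values in reality
DECLARE @bIds NVARCHAR(MAX) = N'[10,20,30]'; -- hundreds to thousands of values

SELECT t.A, t.B /* plus grouping and aggregates */
FROM dbo.BigTable AS t
WHERE t.A IN (SELECT CAST(j.value AS INT) FROM OPENJSON(@aIds) AS j)
  AND t.B IN (SELECT CAST(j.value AS INT) FROM OPENJSON(@bIds) AS j);
```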


Solution

  • Thanks to martin-smith's comments, it became clear that the real culprit was OPENJSON, which was used as a means to pass the large lists of ids from Entity Framework Core down to SQL Server.

    Apparently, the Query Optimizer assumes only 50 elements in a JSON array, and when there are significantly more of them, the execution plan starts to have little in common with reality.

    Accordingly, removing the column from the index took away the option to use it, and the Query Optimizer was forced into the "slower" execution plan, which is in fact far faster with a large number of items in the JSON.
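An alternative workaround (instead of changing the index) that is often suggested for this class of problem: materialize the parsed JSON into temp tables first, so the optimizer sees real row counts and column statistics rather than its fixed guess for OPENJSON. A sketch under the same hypothetical names as above:

```sql
-- Parse the JSON once into temp tables; unlike the inline OPENJSON call,
-- temp tables carry accurate cardinality and statistics into the main query.
SELECT CAST(j.value AS INT) AS A INTO #aIds FROM OPENJSON(@aIds) AS j;
SELECT CAST(j.value AS INT) AS B INTO #bIds FROM OPENJSON(@bIds) AS j;

SELECT t.A, t.B /* plus grouping and aggregates */
FROM dbo.BigTable AS t
JOIN #aIds AS a ON a.A = t.A
JOIN #bIds AS b ON b.B = t.B;
```

With EF Core in the middle this is harder to inject, but the same idea can be applied via a stored procedure or a raw-SQL query path.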