Situation: Azure SQL Server database (20 DTUs) with around 20% of space used. No other processes are running queries and the Compute utilization chart shows a solid line on 0%.
We have an empty table with around 80 columns. No indexes created. We have a query like this
SELECT column1,
column2,
...
columnN
FROM table
WHERE column1 = "some"
AND column2 = "something"
AND column3 = "something"
AND column4 = "something"
On the WHERE
clause, we are using 4 columns (one of them is a Timestamp
column).
When we run this query, after several minutes of running, we kill the process as it never ends, with the database CPU at 100%. As I said, the table is empty.
I created a non clustered index like this:
CREATE NONCLUSTERED INDEX [index_name]
ON [dbo].[table] ([column1] ASC, [column2] ASC,
[column3] ASC, [column4] ASC)
WITH (STATISTICS_NORECOMPUTE = OFF, DROP_EXISTING = OFF,
ONLINE = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO
After creating the index, the same query takes several minutes to finish (again, with the CPU at 100%). The query execution plan just shows a table scan
. My first question is, why do we get a table scan
instead of an index scan
? The index contains the 4 columns that are included on the WHERE
clause.
However, if I create the same index but as a CLUSTERED
one, the query finishes almost instantly and the execution plan shows an index clustered seek
.
Why does a query over an empty table need a clustered index to conclude that is empty? Why it isn't enough to have a non clustered one?
Any help would be appreciated.
UPDATE.
The execution plan with just the non clustered index. Note: Since I already created a clustered index and dropped it to show this execution plan, now it is not an index scan anymore. It's an index seek. So it seems that the problem was indeed a heap issue.
The execution plan once the clustered index was created is here.
The estimated plans don't confirm it but this is almost certainly a case where the heap was once large and is still holding onto a lot of pages that haven't been deallocated since those rows were removed.
This symptom is particularly prevalent with heaps and, as Martin pointed out, Paul Randal wrote about it here:
Adding a clustered index resolves the issue because it has to completely rebuild the table, and at that point can finally deallocate the empty pages. And the symptom would still stay away even if you drop the clustered index and add a non-clustered index, because those deallocated pages are gone forever. However, if you keep just a non-clustered index, and fill the table up over time, and again clear it out, you're going to run into the same issue again.
Only adding a non-clustered index (without first creating a clustered index or otherwise forcing a rebuild) doesn't solve the problem because that doesn't rebuild the table like adding a clustered index does. So the pages still don't get deallocated.
You could also resolve the symptom by dropping the empty table and re-creating it (if it is in fact empty), or - whether it is empty or not - manually rebuilding it:
ALTER TABLE dbo.MyHeap REBUILD;
However, to prevent the symptom from happening at all, I just wouldn't use a heap. Create the clustered index and leave it there.