I have table structure as below
CREATE TABLE [dbo].[AIRQUALITYTS2]
(
[FeatureID] [nvarchar](20) NOT NULL,
[ParameterID] [nvarchar](20) NOT NULL,
[MeasurementDateTime] [datetime2](7) NOT NULL,
[ParameterValue] [numeric](38, 8) NULL,
[Remarks] [nvarchar](150) NULL,
CONSTRAINT [PK_AIRQUALITYTS2]
PRIMARY KEY CLUSTERED ([FeatureID] ASC, [ParameterID] ASC, [MeasurementDateTime] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
When I execute this query:
set statistics io on
SELECT
COUNT(featureid), featureid
FROM
AIRQUALITYTS2
WHERE
FeatureID LIKE 'AS%'
AND ParameterID = 'AP2'
AND YEAR(MeasurementDateTime) = 2015
GROUP BY
FeatureID
ORDER BY
FeatureID
I see the logical records 101871 and query execution plan is
But when I add non-clustered index on this table as
CREATE NONCLUSTERED INDEX non_fidpidmdate
ON [dbo].[AIRQUALITYTS2] ([ParameterID], [FeatureID])
INCLUDE ([MeasurementDateTime])
When I execute same query I see logical records reads 4636 only and is very fast and query execution plan is
Question 1: when there is less logical records in second query.
Question 2: Why first query is using clustered index scan as displayed in first image though it has clustered index on featureid,ParameterID and MeasurementDateTime while after adding non-cluster index it uses Index Seek (Non-Clustered) second image as displayed in images
Note: I have change where clause to
MeasurementDateTime >= '2004-01-01 00:00:00'
and MeasurementDateTime <= '2004-12-31 00:00:00'
to make it sargable but still the results are the same.
CREATE TABLE
where you create the PRIMARY KEY CLUSTERED
, it specifies the columns to cluster on, in the order they are clustered (stored) in. [FeatureID] [ParameterID] [MeasurementDateTime]
If you run a query with a WHERE
clause that includes a specific FeatureID
then it would be able to seek
to that part of the index.
But you haven't done that in the query.
You've used WHERE FeatureID LIKE 'AS%' ...
The query engine cannot seek, because that LIKE
with a trailing wildcard %
means it has to scan across all the FeatureIDs that start with the letters AS
and then within each of those nodes in the tree see if there are records that match ParameterID = 'AP2' AND YEAR(MeasurementDateTime) = 2015
.
[ParameterID] [FeatureID]
When you run the same query, it can seek
because you've specified an exact ParameterID
in the WHERE
clause.
Ordering is important! SQL Indexes are sortof B-Tree data structures, and you can't physically store them (or traverse them) in different orderings without creating multiple indexes. Creating too many indexes can be too much overhead for the database, so yes create ones that help the majority of your queries, but don't create too many. Mostly this involves knowing what sort of queries are frequently run against your database and tuning accordingly.