I have a question concerning best-practices for indexing in SQL Server (or any RDBMS for that matter). Take the following table:
ProfileID int
Text nvarchar(50)
ProfileID
is joined to a Profile
table. For each profile, each Text
must be unique. Therefore I put a Primary cover key on both columns. Fine.
However, I also want to be able to query the above table by ProfileID
. So I also put an index on ProfileID
too.
This means I have an overlapping index. I don't know if this is a total waste since there is a cover index already, or if it's correct since the cover index would be a hash of the two columns (or am I misunderstanding cover indexes)?
Edit:
I created the index in the order (ProfileID, Text)
. What if, for argument's sake, there were 3 columns A, B, and C, that had a cover index over all 3. Would it only benefit if we queried against "A" or "A, B, and C", but not "B", or "C", or "B and C"?
An index on (ProfileID, Text)
(in this order) is an index on ProfileID
as well.
You may still want to create an additional index on ProfileID
only, if you want greater SELECT
performance on queries that do not involve Text
.
However, this has two drawbacks:
Maintaining two indexes requires more resources and performance of DML
queries (INSERT
, UPDATE
, DELETE
) may suffer
If you mix queries of the two types, both indexes will occupy the cache and there can be more cache misses than with a single index.
It is not a problem if your table is small enough to fit into the cache along with both indexes.
The cover index would be a hash of the two columns (or am I misunderstanding cover indexes)?
A truly covering index would be created this way:
CREATE INDEX ix_mytable_profile__text ON mytable (ProfileID) INCLUDE (Text)
This way, Text
would only be stored in leaf-level nodes of the index.
However, since you need a UNIQUE
index, both columns need to be parts of the key. The nodes are sorted lexicographically on ProfileID
then Text
.
I created the index in the order (ProfileID, Text). What if, for argument's sake, there were 3 columns A, B, and C, that had a cover index over all 3. Would it only benefit if we queried against "A" or "A, B, and C", but not "B", or "C", or "B and C"?
CREATE INDEX ix_mytable_a_b_c ON mytable (a, b, c)
SELECT a, b, с
FROM mytable
WHERE a = 1
-- Index lookup, no table lookup. a is leading
SELECT a, b, с
FROM mytable
WHERE a = 1
AND b = 1
-- Index lookup, no table lookup. (a, b) are leading.
SELECT a, b, с
FROM mytable
WHERE b = 1
-- Index full scan (`b` is not leading), no table lookup
SELECT a, b, с
FROM mytable
WHERE c = 1
-- Index full scan (`c` is not leading), no table lookup
SELECT a, b, с, d
FROM mytable
WHERE a = 1
-- Index lookup, table tookup (d is not a part of the index).
SELECT a, b, с, d
FROM mytable
WHERE b = 1
-- Table full scan (there is no point in using index at all, neither for lookup nor for covering).