Search code examples
sqlunique-constraintunique-index

Why do unique indices exist at all in SQL?


I'm trying to understand the reason for unique indices in SQL.

A lot has been said about unique constraints vs. unique indices. In most SQL implementations there not only seems to be no practical difference whatsoever, but under the hood a unique constraint usually seems to be handled identically to a unique index.

But if

CREATE TABLE t1 (col1, col2 UNIQUE);

is similar to:

CREATE TABLE t1 (col1, col2);
CREATE UNIQUE INDEX t1_col2 ON t1 (col2);

in terms of how the UNIQUE constraint is being handled, then why do we have unique indices in the first place? They seem redundant. Yet all SQL implementations seem to have them, which means there is presumably a good reason for them to exist.

What is this reason?


Solution

  • SQL had many seeming redundancies. For instance:

    • SELECT DISTINCT can be replaced by GROUP BY.
    • COUNT(*) is equivalent to COUNT(1) and SUM(1). (In fact, COUNT() is unnecessary.)
    • CROSS JOIN is basically equivalent to , (the scoping rules are slightly different, but no one will notice).
    • CROSS JOIN is equivalent to JOIN . . . ON 1=1

    And there are many more examples.

    Standard SQL was accepted by committee. Different vendors had to agree on the standard. As with anything so designed, strange anomalies pop-up.

    I might also add that I think indexes were around before constraints, so the proper question is why unique constraints are needed.

    In the case of a unique constraint versus a unique index, the issue might be slightly different. The two are doing rather different things. A unique constraint is putting a constraint on columns in the table. A unique index is used for improving performance -- it is a variation on other types of indexes.

    In practice, a unique constraint is implemented by a unique index. Nothing in the standard requires that be the case, but it is the obvious solution.

    By the way, there is one difference between the two. A constraint can be given a name. This is handy because constraint violations generally use that name when an error occurs -- regardless of the type of constraint. In addition, it is also possible to list all the constraints on a table, without having to decipher particular types.